How to Import a Dataset in Python
When working in Python, we often need to load a dataset for data analysis. Python has several modules that help import external data in different file formats. In this article, we will see how to import data from the following sources:
- Import CSV file
- Read Excel File
- Read Text File
- Read SQL Table
Most of the examples in this article use pandas, a powerful data analysis package that makes data exploration and manipulation easy and provides several functions for reading data from various sources. The CSV example uses Python's built-in csv module instead. If pandas is not installed on your system, install it with the following command:
pip install pandas
Import CSV file
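The csv module reads a file row by row and splits each row into fields on a delimiter, which is a comma by default.
import csv

# newline="" lets the csv module handle line endings itself
with open("Students.csv", "r", newline="") as file:
    rows = csv.reader(file, delimiter=",")
    for r in rows:
        # each row is a list of strings
        print(r)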
Output:
['Name', 'Age', 'Grade']
['ABEL', '20', 'A']
['BINDU', '21', 'A']
['CHRISTY', '20', 'A']
['YOUSUF', '21', 'A']
['KRISHNA', '20', 'A']
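Because csv.reader returns every field as a string, it is often convenient to read rows as dictionaries keyed by the header and convert numeric columns explicitly. The following is a minimal sketch using csv.DictReader from the standard library. The helper name read_students and the inline sample text are assumptions made for illustration; in practice you would pass an open file such as Students.csv.

```python
import csv
import io

# Hypothetical sample text mirroring the first rows of Students.csv.
SAMPLE = """Name,Age,Grade
ABEL,20,A
BINDU,21,A
"""


def read_students(file_obj):
    """Return one dict per data row, with Age converted to int."""
    reader = csv.DictReader(file_obj)  # the first row supplies the keys
    students = []
    for row in reader:
        row["Age"] = int(row["Age"])  # csv yields strings; convert explicitly
        students.append(row)
    return students


# io.StringIO stands in for an open file so the sketch runs on its own.
students = read_students(io.StringIO(SAMPLE))
print(students[0])
```

With a real file, the call would look like read_students(open("Students.csv", newline="")), ideally inside a with block so the file is closed afterwards.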
Read Excel File
The read_excel() function in the pandas library imports Excel data into Python as a DataFrame. Reading .xlsx files also requires an Excel engine such as openpyxl to be installed.
import pandas as pd

ds = pd.read_excel("Students.xlsx")
print(ds)
Output:
      Name  Age Grade
0     ABEL   20     A
1    BINDU   21     A
2  CHRISTY   20     A
3   YOUSUF   21     A
4  KRISHNA   20     A
Read Text File
The read_table() function in pandas reads data from a delimited text file. By default it expects the columns to be separated by tabs.
import pandas as pd

ds = pd.read_table("Students.txt")
print(ds)
Output:
      Name  Age Grade
0     ABEL   20     A
1    BINDU   21     A
2  CHRISTY   20     A
3   YOUSUF   21     A
4  KRISHNA   20     A
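Text files are not always tab-separated. As a rough sketch, the sep parameter of read_table() can be set to whatever delimiter the file uses. The sample strings below are made up for illustration, and io.StringIO stands in for a file on disk; a real call would pass a path such as "Students.txt".

```python
import io

import pandas as pd

# Hypothetical file contents: the same two rows, tab-separated and pipe-separated.
TAB_TEXT = "Name\tAge\tGrade\nABEL\t20\tA\nBINDU\t21\tA\n"
PIPE_TEXT = "Name|Age|Grade\nABEL|20|A\nBINDU|21|A\n"

# Tab is the default separator, so no sep argument is needed here.
ds_tab = pd.read_table(io.StringIO(TAB_TEXT))

# For any other delimiter, pass it through sep.
ds_pipe = pd.read_table(io.StringIO(PIPE_TEXT), sep="|")

# Both calls should produce the same DataFrame.
print(ds_tab.equals(ds_pipe))
print(ds_pipe)
```

Passing sep="," makes read_table() behave like read_csv(), which is the usual pandas function for comma-separated files.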
Read SQL Table
Using the pyodbc module, we can connect to database servers. This lets us import data from relational sources with a SQL query.
import pandas as pd
import pyodbc

# Fill in the connection details for your own server
sql_conn = pyodbc.connect("Driver={SQL Server};Server=serverName;UID=UserName;PWD=Password;Database=sqldb;")
# Replace 'SQL QUERY' with the query to run
data_sql = pd.read_sql_query('SQL QUERY', sql_conn)
print(data_sql.head())
Replace the driver, server, username, password, and database values in the connection string with valid details for your environment. Replace the 'SQL QUERY' placeholder with the query you want to run; the result is returned as a pandas DataFrame.
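To try read_sql_query() without a database server, the sketch below uses Python's built-in sqlite3 module with an in-memory database in place of pyodbc. The table name students, its columns, and the sample rows are assumptions made for illustration. It also shows the params argument, which fills the ? placeholder instead of pasting the value into the query text.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database standing in for a real server (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (name TEXT, age INTEGER, grade TEXT)")
conn.executemany(
    "INSERT INTO students VALUES (?, ?, ?)",
    [("ABEL", 20, "A"), ("BINDU", 21, "A"), ("CHRISTY", 20, "A")],
)
conn.commit()

# The ? placeholder is filled from params, so the value never becomes part of the SQL text.
query = "SELECT name, age, grade FROM students WHERE age = ? ORDER BY name"
data_sql = pd.read_sql_query(query, conn, params=(20,))
conn.close()

print(data_sql)
```

pyodbc also uses ? placeholders, so the same query and params arguments should carry over to the connection shown earlier, although the SQL dialect accepted by your server may differ from SQLite.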