CSIP12.in
Back to List
Calculating...
UNIT 1 : CH 4 Dec 14, 2025

📘 Importing/Exporting Data (CSV & MySQL)

## 1. 📝 Key Definitions

* **CSV (Comma Separated Values):** A simple text file format used to store tabular data. Each line is a row, and columns are usually separated by a comma (`,`).
* **DataFrame:** The primary 2D data structure in Pandas (rows and columns).
* **MySQL:** An open-source Relational Database Management System (RDBMS).
* **Persistence:** The ability to save data permanently so it isn't lost when the program closes.

---

## 2. 📄 Transferring Data: CSV ↔ Pandas

### A. Loading Data: `read_csv()`

Used to read a CSV file into a Pandas DataFrame.

#### 🛠️ Syntax & Parameters

```python
pandas.read_csv(filepath, sep=',', header='infer', names=None, index_col=None)

```

| Parameter | Description | Default |
| --- | --- | --- |
| `filepath` | Path to the file (e.g., `"data.csv"` or `"C:/User/data.csv"`). | Required |
| `sep` | The separator used in the file (e.g., `,` `;` `\t`). | `,` |
| `header` | Row number to use as column names. Use `None` if no header exists. | `'infer'` (Row 0) |
| `names` | List of column names to assign (if file has no header). | `None` |
| `index_col` | Column to use as row labels (index) instead of 0,1,2... | `None` |

#### 💡 Examples

**Ex 1: Basic Reading**

```python
import pandas as pd
df = pd.read_csv("students.csv")
print(df)

```

**Ex 2: File with No Header (Important)**
If your CSV file has raw data starting from line 1 (no column titles):

```python
df = pd.read_csv("data.csv", header=None, names=['RollNo', 'Name', 'Marks'])

```

---

### B. Saving Data: `to_csv()`

Used to save a DataFrame into a CSV file.

#### 🛠️ Syntax & Parameters

```python
dataframe_object.to_csv(path_or_buf, sep=',', header=True, index=True, na_rep='')

```

| Parameter | Description | Default |
| --- | --- | --- |
| `path_or_buf` | File name/path where data will be saved. | Required |
| `sep` | Delimiter to use in the output file. | `,` |
| `index` | Whether to write row numbers (0,1,2...). **Set `False` to hide them.** | `True` |
| `header` | Whether to write column names. | `True` |
| `na_rep` | String to replace missing values (NaN). | `''` (Empty) |

#### 💡 Examples

**Ex 1: Standard Save (Clean Output)**
Most common exam requirement: Save without the index numbers.

```python
df.to_csv("result.csv", index=False)

```

**Ex 2: Handling Missing Data**
Replace `NaN` values with "Absent" in the file.

```python
df.to_csv("attendance.csv", na_rep='Absent', index=False)

```

---

## 3. 🗄️ Transferring Data: MySQL ↔ Pandas

### Prerequisites

To work with MySQL, you need to import the connector libraries:

```python
import pandas as pd
import mysql.connector
from sqlalchemy import create_engine # Recommended for to_sql

```

### A. Reading from MySQL: `read_sql()`

Fetches data using a SQL query directly into a DataFrame.

#### 🛠️ Syntax & Parameters

```python
pandas.read_sql(sql, con, params=None)

```

| Parameter | Description |
| --- | --- |
| `sql` | The SQL query string (e.g., `"SELECT * FROM students"`). |
| `con` | The connection object. |
| `params` | List/Tuple of values to pass to the query (for security). |

#### 💡 Examples

**Ex 1: Fetch Whole Table**

```python
# 1. Establish Connection
conn = mysql.connector.connect(host="localhost", user="root", passwd="123", database="school")

# 2. Fetch Data
df = pd.read_sql("SELECT * FROM students", conn)

```

**Ex 2: Flexible Query with User Data (Syllabus Topic 4.3.2)**
Using Python variables to filter SQL data.

```python
marks_limit = int(input("Enter minimum marks: "))

# Using f-string for query construction
query = f"SELECT * FROM result WHERE marks > {marks_limit}"

df = pd.read_sql(query, conn)
print(df)

```

---

### B. Exporting to MySQL: `to_sql()`

Writes records from a DataFrame into a MySQL table.

#### 🛠️ Syntax & Parameters

```python
dataframe_object.to_sql(name, con, if_exists='fail', index=True)

```

| Parameter | Description |
| --- | --- |
| `name` | Name of the SQL table. |
| `con` | Connection engine (SQLAlchemy is best for this). |
| `if_exists` | Behavior if table exists: `'fail'`, `'replace'`, or `'append'`. |
| `index` | Write DataFrame index as a column (Default is True). |

#### 💡 Examples

**Ex 1: Creating/Appending to a Table**

```python
from sqlalchemy import create_engine

# Create Engine: mysql+mysqlconnector://user:pass@host/db_name
engine = create_engine("mysql+mysqlconnector://root:123@localhost/school")

# Append data to existing table 'inventory', do not write index numbers
df.to_sql('inventory', engine, if_exists='append', index=False)

```

---

## 4. 📊 Conceptual Diagrams & Comparison

### The Data Flow

```mermaid
graph LR
A[CSV File] -- read_csv() --> B((Pandas DataFrame))
B -- to_csv() --> A
C[MySQL Database] -- read_sql() --> B
B -- to_sql() --> C

```

### Comparison: read_csv vs read_sql

| Feature | `read_csv` | `read_sql` |
| --- | --- | --- |
| **Source** | Static Text File | Live Database Server |
| **Connection** | Not required | Requires Connection Object |
| **Filtering** | Loads all data, filters in Python | Filters in DB (`WHERE`), loads specific data |
| **Use Case** | Data exchange, backups | analyzing live app data |

---

## 5. ⚠️ Common Errors & Troubleshooting

1. **`FileNotFoundError`**:
* *Fix:* Check the spelling of the filename or use the absolute path (e.g., `C:/Data/file.csv`).


2. **`ModuleNotFoundError: No module named...`**:
* *Fix:* You haven't installed the library. Run `pip install mysql-connector-python`.


3. **Unwanted Index Column**:
* *Issue:* Your CSV/SQL table has an extra column with 0, 1, 2...
* *Fix:* You forgot to write `index=False` inside `to_csv` or `to_sql`.


4. **`InterfaceError`**:
* *Fix:* Wrong username, password, or database name in the connection string.



---

## 6. 🎯 Exam-Oriented Short Notes (Cheat Sheet)

* **`sep` argument:** Change this if your file uses tabs (`\t`) instead of commas.
* **`header=None`:** Mandatory if your CSV file does not have column titles.
* **`index=False`:** Always use this when saving data unless you specifically need the row numbers.
* **`if_exists='replace'`**: Be careful! This deletes the old SQL table and creates a new one. Use `'append'` to add data safely.
* **SQLAlchemy:** While `mysql.connector` is great for reading, `SQLAlchemy` is preferred for `to_sql` to handle data types automatically.