This section gives an overview of the dataset: its structure, feature types, and key statistics. These properties inform the analysis and modeling that follow.
We begin by loading the processed dataset to assess its structure and content.
df = load_data(PROCESSED_DATA)
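The helper load_data and the constant PROCESSED_DATA are defined elsewhere in the project and are not shown in this section. As a rough illustration only, a loader of this kind might look like the sketch below. It assumes the processed dataset is a CSV file, and the name load_data_sketch is hypothetical, chosen so that it does not shadow the real helper.

```python
from pathlib import Path

import pandas as pd


def load_data_sketch(path):
    """Hypothetical stand-in for load_data: read a CSV file into a DataFrame.

    Assumption: the processed dataset is stored as CSV. The project's real
    loader may use another format or apply extra processing.
    """
    path = Path(path)
    if not path.is_file():
        # Fail early with a clear message instead of a parser error.
        raise FileNotFoundError(f"Processed dataset not found: {path}")
    return pd.read_csv(path)
```

Checking that the file exists before parsing keeps a wrong path from surfacing as a less readable error later in the notebook.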
The dataset contains 8,950 rows and 18 columns, which we can confirm from its shape:
df.shape
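Since shape is a (rows, columns) tuple, it can be unpacked or compared against an expected value. The sketch below uses a small toy frame rather than the real data, and check_shape is a hypothetical helper introduced here for illustration, not part of the project.

```python
import pandas as pd

# Toy frame standing in for the real data (3 rows, 2 columns).
toy = pd.DataFrame({"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]})

# shape is a (rows, columns) tuple, so it unpacks directly.
n_rows, n_cols = toy.shape


def check_shape(frame, expected):
    """Hypothetical helper: raise if the frame's shape differs from expected."""
    if frame.shape != expected:
        raise ValueError(f"Expected shape {expected}, got {frame.shape}")
    return frame
```

For the dataset described here, the corresponding call would be check_shape(df, (8950, 18)), which would flag an unexpected change in the processed file before any analysis runs.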
To inspect the dataset's overall structure, including missing values and memory usage, we use: