Data Loading, Storage and File Formats
Accessing data is the first step in translating data into meaningful information. Pandas features a number of functions for reading tabular data into a DataFrame.
Data files can have different sources and different formats. Previously you learned to work with flat files such as files that contain sequence information. In this part of the programming course, we will work mainly with tabular data or data structures that can be easily transformed into a tabular format. Later on, you will learn to work with images and text.
Pandas has a number of methods for reading tabular data as a DataFrame object.
These methods read the data directly into a Pandas DataFrame. Most of them have options to handle missing (NaN) values, to read only part of a file by specifying a number of rows or a chunk size, or to skip the footer.
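As a minimal sketch of these options, the example below uses `pd.read_csv` with `nrows`, `chunksize` and `skipfooter`. The sample data, column names and footer line are invented for illustration and are not part of the course material. Missing values are dropped after reading with `dropna`, which is one possible way to leave out NaN rows.

```python
import io
import pandas as pd

# Hypothetical sample data: "NA" marks a missing value.
rows = ["id,gene,length", "1,abc,1200", "2,def,NA", "3,ghi,950", "4,jkl,2100"]
clean = "\n".join(rows)
# The same data with an invented summary line at the end (a footer).
with_footer = clean + "\ntotal rows: 4"

# skipfooter drops the last line; it is only supported by the Python parser engine.
df = pd.read_csv(io.StringIO(with_footer), skipfooter=1, engine="python")

# nrows reads only the first two data rows.
first_two = pd.read_csv(io.StringIO(clean), nrows=2)

# chunksize returns an iterator of DataFrames instead of one DataFrame.
chunk_sizes = [len(chunk) for chunk in pd.read_csv(io.StringIO(clean), chunksize=3)]

# "NA" is parsed as NaN by default; dropna removes the incomplete rows afterwards.
complete = df.dropna()

print(df)
print(chunk_sizes)
```

In this sketch the chunked read uses the data without a footer, because pandas does not allow `skipfooter` to be combined with `chunksize`.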
Read the documentation to learn about the specifics. In this ebook, we look at the file formats most commonly used in data science: csv, json, xml, html and pdf files.
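To give a first impression, the sketch below reads the same small table from csv text and from json text using `pd.read_csv` and `pd.read_json`. The sample records are made up for illustration. Pandas also provides `pd.read_xml` and `pd.read_html`, which rely on optional parser packages. Pandas has no built-in reader for pdf files, so reading those is assumed to require an additional library.

```python
import io
import pandas as pd

# Hypothetical sample records, once as csv text and once as json text.
csv_text = "name,score\nada,90\nlin,85"
json_text = '[{"name": "ada", "score": 90}, {"name": "lin", "score": 85}]'

# Each reader returns a DataFrame; StringIO lets us treat a string like a file.
from_csv = pd.read_csv(io.StringIO(csv_text))
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv)
print(from_json)
```

Both calls accept a file path instead of a `StringIO` object, which is how they are typically used with real data files.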