JSON (JavaScript Object Notation)

What is JSON?

A lightweight data format used to exchange information between systems. It stands for JavaScript Object Notation. It stores and transmits data strictly as text, organizing information into structured key-value pairs and ordered lists (arrays). 

 

What specific data types can be stored in a JSON file? 

JSON supports six basic data types: strings (text enclosed in double quotes), numbers (integers or floating-point), booleans (true or false), null (empty values), arrays (ordered lists of values enclosed in square brackets [ ]), and objects (unordered collections of key-value pairs enclosed in curly braces { }). It does not natively support dates, functions, or binary data. 

 

How does JSON differ structurally from XML? 

Both JSON and XML are used to structure and transfer data, but they use different syntaxes. XML uses a hierarchical structure based on opening and closing tags (markup). JSON relies on standard programming brackets and commas, making the files smaller in size and faster for machines to parse. Furthermore, JSON aligns directly with the data structures used in modern programming languages, eliminating the need to write complex parsing code. 

 

What are the main technical limitations or strict rules of JSON? 

JSON has a very strict syntax that must be followed perfectly. All text keys must be enclosed in double quotes; single quotes are invalid. A single missing comma or an extra comma at the end of a list will cause the parsing process to fail entirely. Additionally, the JSON specification does not allow for comments inside the file, meaning developers cannot add explanatory text directly within the data structure. 

 

How is JSON practically used in the field of Data Science? 

In data science, JSON is the primary format used to extract raw data from external Application Programming Interfaces (APIs) and NoSQL databases like MongoDB. For example, a data scientist retrieving real-time financial market data from a web API will receive the response as a JSON file. They will then use Python and the pandas library—specifically the pandas.read_json() function—to convert this hierarchical text data directly into a structured DataFrame. Once in DataFrame format, the data can be cleaned, mathematically transformed, and fed into machine learning algorithms.