JSON
JavaScript Object Notation
The JSON format was inspired by the object and array literal syntax of the JavaScript language. Python's dictionary and list literals use a very similar syntax, so JSON looks nearly identical to a combination of Python lists and dictionaries. The main differences are that JSON strings must use double quotes and that JSON writes true, false and null where Python writes True, False and None. Here is a JSON encoding that is roughly equivalent to the simple XML format:
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}
Source: Python https://docs.python.org/3.6/library/json.html
A JSON object looks like a Python dictionary. Its values can themselves be objects, which leads to a tree of dictionaries nested inside dictionaries. The notation is derived from JavaScript, where an object is written as a set of key-value pairs.
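As a small illustration of that nesting, the sketch below parses the menu example from a string and walks down the tree with ordinary dictionary and list indexing. The variable names are only illustrative.

```python
import json

# The menu example from above, held in a string instead of a file
text = '''
{"menu": {
  "id": "file",
  "value": "File",
  "popup": {
    "menuitem": [
      {"value": "New", "onclick": "CreateNewDoc()"},
      {"value": "Open", "onclick": "OpenDoc()"},
      {"value": "Close", "onclick": "CloseDoc()"}
    ]
  }
}}
'''

# json.loads() parses a string; objects become dicts, arrays become lists
menu = json.loads(text)["menu"]

# One level down: a plain value
menu_id = menu["id"]

# Three levels down: a list of dicts inside a dict inside a dict
labels = [item["value"] for item in menu["popup"]["menuitem"]]

print(menu_id)
print(labels)
```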
Load JSON
We can load JSON data from a file into Python objects (dictionaries and lists) with the function json.load(). This is especially handy if we want only certain keys of the JSON file. In the example below I want to make a DataFrame of the hits with a record for each ID. I load the entire JSON file, extract the hits subtree and pass it to pd.DataFrame.from_dict().
import json

# json.load() reads from an open file object; the with block closes the file afterwards
with open('sample.json') as f:
    data = json.load(f)
data
{'max_score': 5.9047804,
'took': 47,
'total': 288,
'hits': [{'_id': '8660',
'_score': 5.9047804,
'entrezgene': '8660',
'name': 'insulin receptor substrate 2',
'symbol': 'IRS2',
'taxid': 9606},
{'_id': '3667',
'_score': 5.812647,
'entrezgene': '3667',
'name': 'insulin receptor substrate 1',
'symbol': 'IRS1',
'taxid': 9606},
{'_id': '3651',
'_score': 5.288981,
'entrezgene': '3651',
'name': 'pancreatic and duodenal homeobox 1',
'symbol': 'PDX1',
'taxid': 9606}]}
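If only a few keys of each hit are needed, they can be picked out with a comprehension before any further processing. The following is a minimal sketch that uses a shortened inline copy of the data instead of sample.json; the choice of keys in wanted is just an example.

```python
import json

# Shortened copy of the response above, as a string so the example is self-contained
raw = (
    '{"total": 288, "hits": ['
    '{"_id": "8660", "_score": 5.9047804, "symbol": "IRS2", "taxid": 9606},'
    '{"_id": "3667", "_score": 5.812647, "symbol": "IRS1", "taxid": 9606}'
    ']}'
)
data = json.loads(raw)

# Keep only the keys we care about from every hit
wanted = ("_id", "symbol")
records = [{key: hit[key] for key in wanted} for hit in data["hits"]]

print(records)
```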
I can investigate the structure of the JSON file by retrieving the keys. In the example below the keys max_score, took, total and hits are returned:
print(data.keys())
dict_keys(['max_score', 'took', 'total', 'hits'])
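Beyond the key names, it often helps to see what kind of value sits behind each key. The helper below, summarize, is a hypothetical convenience function written for this page, not part of the json module; it reports the type of each top-level value and the length of any list.

```python
# Shortened stand-in for the loaded data
sample = {
    "max_score": 5.9047804,
    "took": 47,
    "total": 288,
    "hits": [{"_id": "8660", "symbol": "IRS2"}, {"_id": "3667", "symbol": "IRS1"}],
}

def summarize(obj):
    """Map each top-level key to its value's type name, or to the list length."""
    summary = {}
    for key, value in obj.items():
        if isinstance(value, list):
            summary[key] = f"list of {len(value)}"
        else:
            summary[key] = type(value).__name__
    return summary

summary = summarize(sample)
for key, kind in summary.items():
    print(key, "->", kind)
```

A list value, such as hits here, is usually the part worth turning into a table.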
It seems that the hits key contains the interesting records we want to investigate further. I will parse these into a pandas DataFrame:
import pandas as pd
df_data = pd.DataFrame.from_dict(data['hits'])
df_data
    _id    _score entrezgene                                name symbol  taxid
0  8660  5.904780       8660        insulin receptor substrate 2   IRS2   9606
1  3667  5.812647       3667        insulin receptor substrate 1   IRS1   9606
2  3651  5.288981       3651  pancreatic and duodenal homeobox 1   PDX1   9606
Write to JSON
We can also define a Python dictionary and write it to a JSON file. We use the function json.dump() to do so:
data = {"menu": {
    "id": "file",
    "value": "File",
    "popup": {
        "menuitem": [
            {"value": "New", "onclick": "CreateNewDoc()"},
            {"value": "Open", "onclick": "OpenDoc()"},
            {"value": "Close", "onclick": "CloseDoc()"}
        ]
    }
}}
# json.dump() serializes the dictionary and writes it to the open file
with open('output.json', 'w') as f:
    json.dump(data, f)
We can use https://jsonlint.com to validate the output.
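A quick local check is also possible: if the text parses with json.loads(), it is syntactically valid JSON. The sketch below assumes that a syntax check is all that is needed; is_valid_json is a hypothetical helper name, and the indent argument of json.dumps() is used only to make the output easier to read.

```python
import json

def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True

data = {"menu": {"id": "file", "value": "File"}}

# json.dumps() returns the JSON text as a string instead of writing to a file
encoded = json.dumps(data, indent=2)

ok = is_valid_json(encoded)
# Single-quoted keys are valid Python but not valid JSON
bad = is_valid_json("{'menu': 'file'}")

print(ok, bad)
```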
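Mind you, if you want to write a DataFrame to JSON you need to think about the orientation. Note that to_json() already returns a JSON string, so it should be written to the file directly; passing it through json.dump() would encode the string a second time.
# to_json() returns a JSON string; write it as-is
json_text = df_data.to_json(orient='records')
with open('output.json', 'w') as f:
    f.write(json_text)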
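To see what the orientation changes, the sketch below serializes a small two-row DataFrame with three of the orient values that to_json() accepts. The DataFrame here is a made-up stand-in for df_data, reduced to two columns.

```python
import json
import pandas as pd

# Small stand-in for df_data
df = pd.DataFrame({"symbol": ["IRS2", "IRS1"], "taxid": [9606, 9606]})

# One JSON object per row, collected in a list
as_records = df.to_json(orient="records")

# Row labels as the outer keys, column names as the inner keys
as_index = df.to_json(orient="index")

# Column names as the outer keys, row labels as the inner keys
as_columns = df.to_json(orient="columns")

print(as_records)
print(as_index)
print(as_columns)

# The records layout can be read straight back into a DataFrame
restored = pd.DataFrame(json.loads(as_records))
```

The records layout drops the row index, which is usually fine when the index is just a running number, as it is here.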