JSON is a powerful and flexible data format. It is used for a variety of applications and databases to transfer information from one computer or application to another. It can be confusing when you see JSON in a spreadsheet initially since there will be more information in a cell than usual. But, once you understand the format it is easy to use. Sourcetable gives you a formula, PARSE()
, to convert JSON data into a useable text format for analysis in your spreadsheets.
What is JSON?
If you are not familiar with JSON, it stands for JavaScript Object Notation. It is a lightweight data transfer format used for sending data over the internet from databases like MongoDB or from an API call.
JSON consists of data objects in key value pairs. Keys are like field names or column names, and the values can be integers, floats, strings, arrays, or another nested object. In the example above the sales customer column data is a json object in the spreadsheet. The JSON object begins and ends in curly braces:
JSON Data Types
The keys are on the left side of the ellipses and the value is on the right side. The object has the following properties: gender, satisfaction, age and email. The object can be broken down further into key value pairs such as "gender":"M"
which is a key value pair, meaning they are linked in the database. JSON objects can have many data types:
- Number - either an integer or floating point number
- String - a sequence of unicode characters delimited with double quotation marks
- Boolean - true or false values
- null: an empty value, using the word null
- Array - using a square bracket notation[] with comma separated parts it is an ordered list with as little as zero parts but usually more
- Object - a set of key-value pairs with the keys being strings. Keys do not have to be unique and objects are delimited with curly brackets
{ "gender":"M","satisfaction":3, "Age":49, "email":"jegkeol@tuzib.tl" }
Arrays in JSON
Arrays are extremely helpful when working with data. It is an ordered list, starting at zero using data of the same type. Usually you will see arrays in JSON data where there is a key with multiple values in an array:
{ "name": ["Mei", "Justin", "Felicia", "Maria"] }
. Arrays allow you to access each value individually.Why do I need to parse json data?
Parsing JSON objects is necessary to analyze the data they contain. If you would like to analyze the parts of a JSON object then you will need to parse the data and separate it into its own columns rather than leave the data in the JSON format. For example if you are trying to count the number of customer email addresses that you have but your data is in the JSON format above then you will need to separate (parse) the data to create the sum calculation since the computer can only access that JSON as a single string.
How do I parse json data?
Parsing json data is easy in Sourcetable. All that you need is the PARSE function and the key you want to display. Below you can see the JSON object on the left under the Sales column and the data for email in being separated and displayed in the email column with the formula
=PARSE($A2, “[‘email’]”)
and structured referencing can be used in conjunction with the PARSE formula. The parse function is essentially,=PARSE(cell containing JSON object, “[‘key name’]”)
.Parsing the JSON object using structured referencing with a data table named sales in the column named customers you would have this formula,
=PARSE(sales[customer],"['gender']")
.Parsing JSON arrays are a little more complicated but still simple. A JSON array is formatted like this:
[ {"external_url":"http://mashpedia.com/Intel","title":"Intel at Mashpedia"},{"external_url":"http://www.ultrabooks-test.net/", "title":"Intel Ultrabooks at Ultrabooks-Test.net"}, {"external_url":"http://techcrunch.com/tag/ultrabooks/", "title":"Techcrunch on Intel Ultrabooks"} ]
If you wanted to create new columns for "title":"Techcrunch on Intel Ultrabooks" from the JSON, like the array above, you would need to put in the formula
=PARSE(companies[external_links], "[2]['title']")
.The square brackets
[]
indicate an array and when you want to select part of the array to be in a new column you have to choose the order of the item you want from the array. The order starts at zero and the array would have the following order:[0,1,2]
. If you wanted to obtain the first complete item in list, which in this case is a JSON object, then you would use this formula:=PARSE(A2,"[0]")
The structure of JSON means that as long as you know the order of the array and the key then you can parse the data in Sourcetable.
Conclusion
JSON looks confusing at first but once you understand its structure you can use the Parse formula to easily analyze JSON in Sourcetable. Simply use cell references or structured referencing along with the key names that you want to separate out like in this example:
=PARSE($A2, “[‘email’]”)
.Now you can analyze data contained in JSON! Ready to try Sourcetable?
🖥️Sign up now for a free trial.
"number":42
“string”
“boolean”:TRUE
"empty":null
“customers”:[
{"gender":"M","satisfaction":3,"age":49,"email":"jegkeol@tuzib.tl"},
{"gender":"M","satisfaction":3,"age":28,"email":"dagu@es.fm"},
{"gender":"F","satisfaction":1,"age":50,"email":"da@sahwo.co"}
]