Working with JSON APIs

Outline

  • Python APIs with JSON
  • Examples with OpenNotify APIs

What are APIs?

Application Programming Interfaces

  • Commonly used to retrieve data from servers/sites
  • Twitter, Facebook, Reddit offer data through APIs

Why APIs for data?

  • Dynamic, changing data
  • Data of interest from a larger data set
  • Related computation to be done on the server side

Client-server communication with APIs:

  • Client (program) sends a request to a remote (web) server
  • Server processes the request and provides a response (with status and data)
  • Client (program) receives response and data
  • Client processes the data

APIs on the web:

  • Communicate using the HTTP or HTTPS protocol
  • APIs are hosted on web servers
    • They are essentially web (data) “pages”
    • You can use a browser or related tools to test the APIs
  • JSON is the common data format
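
The request and response cycle described above can be tried locally, without any remote service. The following is a minimal sketch using only the Python standard library. The handler class DemoHandler, the function demo_round_trip, and the /hello.json path are hypothetical names chosen for illustration; they are not part of any real API.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answer every GET with a small JSON document."""

    def do_GET(self):
        body = json.dumps({"message": "success", "path": self.path}).encode("utf-8")
        self.send_response(200)                      # status
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)                       # data

    def log_message(self, format, *args):
        pass  # keep the demo quiet: no per-request log lines


def demo_round_trip():
    """Start a local server, send one GET request, return (status, parsed data)."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client sends a request ...
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/hello.json")
        # ... receives the response (status and data) ...
        response = connection.getresponse()
        status = response.status
        raw = response.read()
        connection.close()
    finally:
        server.shutdown()
        server.server_close()
    # ... and processes the data.
    return status, json.loads(raw)


status, data = demo_round_trip()
print(status, data)
```

The four steps of the client-server exchange map onto the commented lines in demo_round_trip; a real web API differs mainly in that the server runs on a remote machine.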

Working with APIs

Python programming with web APIs:

  • Uses the requests library for HTTP communication
  • Types of HTTP requests: GET, POST, DELETE, …
  • GET requests are the most common way to retrieve data
  • A request is sent to an endpoint, that is, a server route (URL) through which the API returns data

Example web APIs:

  • OpenNotify API: http://open-notify.org/
  • Uses GET requests to retrieve information about the International Space Station (ISS)
  • Multiple endpoints for different data

Example endpoint: iss-now.json

Example Python request to OpenNotify API:

import requests
# Make a get request to get the latest position of 
# the international space station from the opennotify api.
response = requests.get("http://api.open-notify.org/iss-now.json")

# Print the status code of the response.
print(response.status_code)
200

Common HTTP status codes:

  • 200 -- everything is okay
  • 301 -- the server is redirecting you to a different endpoint
  • 400 -- bad request
  • 401 -- not authenticated
  • 403 -- access is forbidden
  • 404 -- the resource was not found on the server
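
The codes listed above also ship with Python itself. As a small sketch, the standard-library http.HTTPStatus enum can turn a numeric code into its official phrase; the helper describe_status and its category table are hypothetical names introduced here for illustration.

```python
from http import HTTPStatus

# Coarse meaning of the first digit of a status code.
CATEGORIES = {2: "success", 3: "redirection", 4: "client error", 5: "server error"}


def describe_status(code):
    """Return a readable description such as '404 Not Found (client error)'.

    HTTPStatus(code) raises ValueError for codes it does not know.
    """
    phrase = HTTPStatus(code).phrase
    category = CATEGORIES.get(code // 100, "other")
    return f"{code} {phrase} ({category})"


for code in (200, 301, 400, 401, 403, 404):
    print(describe_status(code))
```

The first digit carries most of the information: 2xx means the request worked, 4xx means the client should change the request before retrying.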

Request to a non-existent endpoint:

# the endpoint/url does not exist 
response = requests.get("http://api.open-notify.org/iss-pass")
print(response.status_code)
404

Try the URL in your browser. What did you get?

Another try:

response = requests.get("http://api.open-notify.org/iss-pass.json")
print(response.status_code)
400

This hits an existing endpoint, but without the parameters it requires, so the server reports a bad request.

Working with APIs in Python

  • Connect to an available endpoint (URL)
  • Provide required parameters in the request
  • Proper authentication, if necessary

For documentation of this OpenNotify endpoint, see http://open-notify.org/Open-Notify-API/ISS-Pass-Times/

The documentation shows that the iss-pass.json endpoint:

  • Returns when (time) the ISS will pass over a given location on earth
  • Requires two parameters about the location
    • lat: the latitude of the location
    • lon: the longitude of the location
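
Parameters of a GET request travel in the URL itself, as a query string after a question mark. The sketch below builds such a URL with the standard library only, to show roughly what a request with lat and lon looks like on the wire; build_url is a hypothetical helper, and this is an illustration rather than the internals of any HTTP library.

```python
from urllib.parse import urlencode

BASE_URL = "http://api.open-notify.org/iss-pass.json"


def build_url(base_url, params):
    """Append the parameters to the endpoint as a URL-encoded query string."""
    return f"{base_url}?{urlencode(params)}"


# Latitude and longitude of New York City.
url = build_url(BASE_URL, {"lat": 40.71, "lon": -74})
print(url)
# http://api.open-notify.org/iss-pass.json?lat=40.71&lon=-74
```

Pasting a URL of this form into a browser is a quick way to test an endpoint by hand.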

Correct endpoint with required parameters:

# Set up the parameters we want to pass to the API.
# This is the latitude and longitude of New York City.
parameters = {"lat": 40.71, "lon": -74}

# Make a get request with the parameters.
response = requests.get("http://api.open-notify.org/iss-pass.json", params=parameters)

# Print the raw content (bytes) of the response from the server.
print(response.content)
b'{\n  "message": "success", \n  "request": {\n    "altitude": 100, \n    "datetime": 1589830175, \n    "latitude": 40.71, \n    "longitude": -74.0, \n    "passes": 5\n  }, \n  "response": [\n    {\n      "duration": 231, \n      "risetime": 1589840955\n    }, \n    {\n      "duration": 631, \n      "risetime": 1589846510\n    }, \n    {\n      "duration": 635, \n      "risetime": 1589852315\n    }, \n    {\n      "duration": 567, \n      "risetime": 1589858203\n    }, \n    {\n      "duration": 585, \n      "risetime": 1589864065\n    }\n  ]\n}\n'

The raw output is hard to read. Check the response headers, in particular the content-type field, to see which format the server used for the returned data:

response = requests.get("http://api.open-notify.org/iss-pass.json", params=parameters)
print(response.headers)
print("Content type of server response is:", response.headers["content-type"])
{'Server': 'nginx/1.10.3', 'Date': 'Mon, 18 May 2020 19:51:20 GMT', 'Content-Type': 'application/json', 'Content-Length': '519', 'Connection': 'keep-alive', 'Via': '1.1 vegur'}
Content type of server response is: application/json
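
A content-type value is not always the bare string application/json; servers may append parameters such as a character set. The following sketch shows one way to test the header before parsing. The function is_json_content_type is a hypothetical helper, and treating media types ending in +json as JSON is an assumption of this example.

```python
def is_json_content_type(header_value):
    """Return True if a Content-Type header value announces JSON data."""
    # Drop optional parameters such as "; charset=utf-8" and normalize case.
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type == "application/json" or media_type.endswith("+json")


print(is_json_content_type("application/json"))                 # True
print(is_json_content_type("application/json; charset=utf-8"))  # True
print(is_json_content_type("text/html"))                        # False
```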

Since the content type is application/json, there are two ways to parse the data as JSON:

import json
from pprint import pprint

# parse response content to JSON
data = json.loads(response.content)
print("json.loads: ")
pprint(data)

# OR, use the response.json() to get JSON directly
same_data = response.json()
print("response.json:")
pprint(same_data)
json.loads: 
{'message': 'success',
 'request': {'altitude': 100,
             'datetime': 1589830175,
             'latitude': 40.71,
             'longitude': -74.0,
             'passes': 5},
 'response': [{'duration': 231, 'risetime': 1589840955},
              {'duration': 631, 'risetime': 1589846510},
              {'duration': 635, 'risetime': 1589852315},
              {'duration': 567, 'risetime': 1589858203},
              {'duration': 585, 'risetime': 1589864065}]}
response.json:
{'message': 'success',
 'request': {'altitude': 100,
             'datetime': 1589830175,
             'latitude': 40.71,
             'longitude': -74.0,
             'passes': 5},
 'response': [{'duration': 231, 'risetime': 1589840955},
              {'duration': 631, 'risetime': 1589846510},
              {'duration': 635, 'risetime': 1589852315},
              {'duration': 567, 'risetime': 1589858203},
              {'duration': 585, 'risetime': 1589864065}]}

The parsed JSON is an ordinary Python dictionary, so specific data elements such as message can be accessed by key:

print(data['message'])
success
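
The nested elements can be processed the same way. The sketch below walks the list stored under response and makes each pass readable. It reuses the sample data printed earlier rather than calling the server, and it assumes that risetime is a Unix timestamp in seconds and that duration is given in seconds; format_passes is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Sample data copied from the parsed response shown earlier.
data = {
    "message": "success",
    "request": {"altitude": 100, "datetime": 1589830175,
                "latitude": 40.71, "longitude": -74.0, "passes": 5},
    "response": [
        {"duration": 231, "risetime": 1589840955},
        {"duration": 631, "risetime": 1589846510},
        {"duration": 635, "risetime": 1589852315},
        {"duration": 567, "risetime": 1589858203},
        {"duration": 585, "risetime": 1589864065},
    ],
}


def format_passes(payload):
    """Return one readable line per ISS pass in the payload."""
    lines = []
    for iss_pass in payload["response"]:
        # Assumption: risetime counts seconds since 1970-01-01 UTC.
        rise = datetime.fromtimestamp(iss_pass["risetime"], tz=timezone.utc)
        minutes, seconds = divmod(iss_pass["duration"], 60)
        stamp = rise.strftime("%Y-%m-%d %H:%M:%S")
        lines.append(f"{stamp} UTC, visible for {minutes} min {seconds} s")
    return lines


for line in format_passes(data):
    print(line)
# First line: 2020-05-18 22:29:15 UTC, visible for 3 min 51 s
```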

Twitter API

A tweet is a short text message (originally limited to 140 characters) that carries:

  • Date and time, links
  • User mentions (@) and hashtags (#)
  • Retweets, locale, favorites, geocode

Access Twitter API:

  • Twitter applications need OAuth for authentication
  • OAuth uses token-based authentication

Create an application to obtain the four credentials:

  1. Consumer key
  2. Consumer secret
  3. Access token
  4. Access secret
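
These four values are secrets and are better kept out of source code. One common approach, sketched below, reads them from environment variables. The variable names (TWITTER_CONSUMER_KEY and so on) and the helper load_credentials are assumptions made for this example, not names required by Twitter or by any library.

```python
import os

# Hypothetical environment variable names for the four credentials.
CREDENTIAL_VARS = {
    "consumer_key": "TWITTER_CONSUMER_KEY",
    "consumer_secret": "TWITTER_CONSUMER_SECRET",
    "access_token": "TWITTER_ACCESS_TOKEN",
    "access_secret": "TWITTER_ACCESS_SECRET",
}


def load_credentials(environ=None):
    """Collect the four credentials from a mapping (default: os.environ)."""
    if environ is None:
        environ = os.environ
    missing = [var for var in CREDENTIAL_VARS.values() if not environ.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {name: environ[var] for name, var in CREDENTIAL_VARS.items()}


# Usage with placeholder values standing in for a real environment.
fake_environ = {
    "TWITTER_CONSUMER_KEY": "ck",
    "TWITTER_CONSUMER_SECRET": "cs",
    "TWITTER_ACCESS_TOKEN": "at",
    "TWITTER_ACCESS_SECRET": "as",
}
credentials = load_credentials(fake_environ)
print(sorted(credentials))
```

The resulting dictionary can then be handed to whichever client library is used to talk to the API.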

Python Twitter API (twython)

  • Install the twython module with pip:
pip install twython
