Describes how to authenticate with the OnCrawl API.

User Session Authentication

You can query the API while authenticated as a real OnCrawl user. First send a POST request to /api/session to retrieve a session token, which is valid for one week.

The Python snippet below authenticates and then lists the user’s projects:

import requests
# Authenticate
resp = requests.post('https://app.oncrawl.com/api/session',
                     json=dict(identification='USERNAME',
                               password='PASSWORD'))
token = resp.json()['session']['token']
# List projects
resp = requests.get('https://app.oncrawl.com/api/projects',
                    headers={'x-oncrawl-token': token})
for project in resp.json()['projects']:
    print('{id}: {start_url}'.format(**project))

The output of this program might look like this:

test_project: http://www.cultura.com/
test_project_coc: http://www.oncrawl.com
test_project_ga: http://www.oncrawl.com
test_project_cf: http://www.reezocar.com/
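
The snippet above assumes that authentication succeeds. As a minimal sketch, the hypothetical helper below (extract_token is an illustrative name, not part of the OnCrawl API) pulls the token out of a decoded response body and fails with a readable message when the session/token field used in the example is missing. What the API returns on a failed login is not described here, so the check only looks at the shape of the payload:

```python
def extract_token(payload):
    """Return the session token from a decoded /api/session response body.

    Hypothetical helper: it relies only on the {'session': {'token': ...}}
    layout used in the example above.
    """
    session = payload.get('session') if isinstance(payload, dict) else None
    token = session.get('token') if isinstance(session, dict) else None
    if not token:
        raise ValueError('no session token in response: {!r}'.format(payload))
    return token


# Stand-in for resp.json(); a real call would go through requests.post().
sample = {'session': {'token': 'abc123'}}
print(extract_token(sample))
```

In the example above, this would replace the direct resp.json()['session']['token'] lookup.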

User Session Token Reusability

A user session token is valid for one week. During testing, you can for instance store the token in a file to avoid authenticating over and over again. Below is the same example as above, but better structured:

import requests


def get_session(force=False):
    token_file = '.oncrawl-token.txt'
    session = requests.session()
    token = None
    if not force:
        try:
            with open(token_file) as istr:
                token = istr.read().rstrip()
        except IOError:
            pass
    if token is None:
        # Authenticate
        resp = session.post('https://app.oncrawl.com/api/session',
                            json=dict(identification='USERNAME',
                                      password='PASSWORD'))
        token = resp.json()['session']['token']
        with open(token_file, 'w') as ostr:
            ostr.write(token)
    session.headers['x-oncrawl-token'] = token
    return session


def list_projects(session):
    resp = session.get('https://app.oncrawl.com/api/projects')
    for project in resp.json()['projects']:
        print('{name}: {start_url}'.format(**project))


if __name__ == '__main__':
    session = get_session()
    list_projects(session)
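
A token file written this way keeps being reused after the token's one-week lifetime has passed. One possible way to handle that, sketched below, is to store the time the token was obtained next to it and to ignore the cache once it is too old. The JSON layout, the one-hour safety margin and the function names save_token and load_token are assumptions made for this illustration:

```python
import json
import time

ONE_WEEK = 7 * 24 * 3600  # token lifetime stated above, in seconds
SAFETY_MARGIN = 3600      # assumption: stop trusting the cache an hour early


def save_token(path, token, now=None):
    """Write the token and the time it was obtained to a JSON file."""
    now = time.time() if now is None else now
    with open(path, 'w') as ostr:
        json.dump({'token': token, 'obtained_at': now}, ostr)


def load_token(path, now=None):
    """Return the cached token, or None if it is missing, unreadable or too old."""
    now = time.time() if now is None else now
    try:
        with open(path) as istr:
            data = json.load(istr)
        token = data['token']
        age = now - data['obtained_at']
    except (IOError, ValueError, KeyError, TypeError):
        # No file, invalid JSON, or an unexpected layout: treat as no cache.
        return None
    if age >= ONE_WEEK - SAFETY_MARGIN:
        return None
    return token
```

In get_session, load_token would take the place of the plain file read and save_token the place of the plain file write; a None result means a fresh POST to /api/session is needed.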

OAuth Token

Once you have obtained an OAuth token, it must be passed in the Authorization header of your requests:

curl -H "Authorization: Bearer OAUTH-TOKEN" https://app.oncrawl.com/api/projects

or in Python:

import requests
TOKEN = 'OAUTH-TOKEN'
# List projects
resp = requests.get('https://app.oncrawl.com/api/projects',
                    headers=dict(Authorization='Bearer ' + TOKEN))
for project in resp.json()['projects']:
    print('{id}: {start_url}'.format(**project))
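
To keep the token out of the source code, one option is to read it from the environment. The following is a minimal sketch only: the variable name ONCRAWL_OAUTH_TOKEN and the helper oauth_headers are assumptions chosen for illustration, not names defined by OnCrawl:

```python
import os


def oauth_headers(environ=os.environ):
    """Build the Authorization header from an environment variable.

    ONCRAWL_OAUTH_TOKEN is a hypothetical variable name chosen for this sketch.
    """
    token = environ.get('ONCRAWL_OAUTH_TOKEN', '').strip()
    if not token:
        raise RuntimeError('ONCRAWL_OAUTH_TOKEN is not set')
    return {'Authorization': 'Bearer ' + token}
```

The returned dictionary can be passed as the headers argument of requests.get, in place of the hard-coded header in the example above.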