Installation¶
Install the library from the Python Package Index with pipenv:
pipenv install court-scraper
Upon installation, you should have access to the court-scraper tool on the command line. Use the --help flag to view available sub-commands:
court-scraper --help
Note
See the usage docs for details on using court-scraper on the command line and in custom scripts.
Default cache directory¶
By default, files downloaded by the command-line tool will be saved to the .court-scraper folder in the user’s home directory. On Linux/Mac systems, this will be ~/.court-scraper/.
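The default location can be derived from the user’s home directory. A minimal sketch in Python (court-scraper’s internal path handling may differ):

```python
from pathlib import Path

# Default cache directory: a ".court-scraper" folder in the user's home.
# (Illustrative only; not court-scraper's actual internals.)
default_cache_dir = Path.home() / ".court-scraper"

print(default_cache_dir)  # e.g. /home/alice/.court-scraper on Linux
```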
Customize cache directory¶
To use an alternate cache directory, set the below environment variable (e.g. in a ~/.bashrc or ~/.bash_profile configuration file):
export COURT_SCRAPER_DIR=/tmp/some_other_dir
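The environment variable takes precedence over the default location. A hedged sketch of how a script might resolve the active cache directory (the variable name comes from the docs; the fallback logic here is an assumption, not court-scraper’s actual code):

```python
import os
from pathlib import Path

def resolve_cache_dir() -> Path:
    """Return COURT_SCRAPER_DIR if set, else the default ~/.court-scraper."""
    override = os.environ.get("COURT_SCRAPER_DIR")
    if override:
        return Path(override)
    return Path.home() / ".court-scraper"

# Simulate the export shown above:
os.environ["COURT_SCRAPER_DIR"] = "/tmp/some_other_dir"
print(resolve_cache_dir())  # /tmp/some_other_dir
```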
Configuration¶
Many court sites require user credentials to log in or present CAPTCHAs that must be handled using a paid, third-party service (court-scraper uses Anti-captcha).
Sensitive information such as user logins and the API key for a CAPTCHA service should be stored in a YAML configuration file called config.yaml. This file is expected to live inside the default storage location for scraped files, logs, etc. On Linux/Mac, the default location is ~/.court-scraper/config.yaml.
This configuration file must contain credentials for each location, keyed by a Place ID: a snake_case combination of state and county (e.g. ga_dekalb for DeKalb County, GA).
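For illustration, a Place ID can be built by lowercasing the state abbreviation and county name and joining them with an underscore. The helper below is hypothetical, not part of the library’s API:

```python
def make_place_id(state: str, county: str) -> str:
    # Hypothetical helper: snake_case combination of state and county.
    county_slug = county.strip().lower().replace(" ", "_")
    return f"{state.strip().lower()}_{county_slug}"

print(make_place_id("GA", "DeKalb"))       # ga_dekalb
print(make_place_id("NY", "Westchester"))  # ny_westchester
```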
Courts with a common software platform that allow sharing of credentials can inherit credentials from a single entry.
Here’s an example configuration file:
# ~/.court-scraper/config.yaml
captcha_service_api_key: 'YOUR_ANTICAPTCHA_KEY'
platforms:
  # Mark a platform user/pass for reuse in multiple sites
  odyssey_site: &ODYSSEY_SITE
    username: 'user@example.com'
    password: 'SECRET_PASS'
# Inherit platform credentials across multiple courts
ga_chatham: *ODYSSEY_SITE
ga_dekalb: *ODYSSEY_SITE
ga_fulton: *ODYSSEY_SITE
# Or simply set site-specific attributes
ny_westchester:
  username: 'user2@example.com'
  password: 'GREAT_PASSWORD'
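The sharing works through standard YAML anchors and aliases: &ODYSSEY_SITE defines an anchor and *ODYSSEY_SITE references it, so any YAML parser resolves the shared credentials for each aliased court. A quick check with PyYAML (assuming PyYAML is installed; this is not court-scraper’s own loading code):

```python
import yaml  # PyYAML; assumed available (e.g. pipenv install pyyaml)

config_text = """
captcha_service_api_key: 'YOUR_ANTICAPTCHA_KEY'
platforms:
  odyssey_site: &ODYSSEY_SITE
    username: 'user@example.com'
    password: 'SECRET_PASS'
ga_chatham: *ODYSSEY_SITE
ga_dekalb: *ODYSSEY_SITE
"""

config = yaml.safe_load(config_text)
# Each alias resolves to the mapping defined under platforms.
print(config["ga_dekalb"]["username"])  # user@example.com
```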
CAPTCHA-protected sites¶
court-scraper uses the Anti-captcha service to handle sites protected by CAPTCHAs. If you plan to scrape a CAPTCHA-protected site, register with the Anti-captcha service and obtain an API key. Then, add your API key to your local court-scraper configuration file as shown below:
# ~/.court-scraper/config.yaml
captcha_service_api_key: 'YOUR_API_KEY'
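Before scraping a CAPTCHA-protected site, it can help to verify that the key is actually present in the config file. A hedged sketch (the file path and key name come from the docs above; this validation helper is not part of court-scraper):

```python
from pathlib import Path
import tempfile

import yaml  # PyYAML; assumed available

def read_captcha_key(config_path: Path) -> str:
    """Return the Anti-captcha API key, raising a clear error if it is missing."""
    config = yaml.safe_load(config_path.read_text())
    key = (config or {}).get("captcha_service_api_key")
    if not key:
        raise KeyError(f"captcha_service_api_key not set in {config_path}")
    return key

# Example usage, with a temporary file standing in for ~/.court-scraper/config.yaml:
with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
    f.write("captcha_service_api_key: 'YOUR_API_KEY'\n")
print(read_captcha_key(Path(f.name)))  # YOUR_API_KEY
```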
Once configured, you should be able to query CAPTCHA-protected sites currently supported by court-scraper.