
Eightfold Jobs API.

AI-powered talent intelligence platform for enterprises with two API patterns: SmartApply and PCSX.

Eightfold
Live
150K+ jobs indexed monthly
<3h average discovery time
1h refresh interval
Companies using Eightfold
American Express, Starbucks, PayPal, Moderna, Nutanix
Developer tools

Try the API.

Test Jobs, Feed, and Auto-Apply endpoints against https://connect.jobo.world with live request/response examples, then copy ready-to-use curl commands.

What's in every response.

Data fields, real-world applications, and the companies already running on Eightfold.

Data fields
  • AI-powered matching
  • Enterprise coverage
  • Skills data
  • Career path info
  • Diversity focus
  • Two API versions
  • Sitemap discovery
Trusted by
American Express, Starbucks, PayPal, Moderna, Nutanix, Dolby, Microsoft
DIY GUIDE

How to scrape Eightfold.

Step-by-step guide to extracting jobs from Eightfold-powered career pages—endpoints, authentication, and working code.

REST · intermediate · ~100 requests/minute (unofficial) · no auth

Detect the API type and fetch job listings

Eightfold uses two API patterns (SmartApply and PCSX). Try both endpoints to determine which one the company uses, then fetch the job listings.

Step 1: Detect the API type and fetch job listings
import requests

company = "aexp"  # American Express
domain = "aexp.com"
base_url = f"https://{company}.eightfold.ai"

# Try SmartApply API first
smartapply_url = f"{base_url}/api/apply/v2/jobs?domain={domain}&hl=en&start=0"
headers = {"Accept": "application/json", "Content-Type": "application/json"}

response = requests.get(smartapply_url, headers=headers)
if response.status_code == 200:
    data = response.json()
    jobs = data.get("positions", [])
    total = data.get("totalJobs", 0)
    print(f"SmartApply: Found {total} jobs, showing {len(jobs)}")
else:
    # Try PCSX API as fallback
    pcsx_url = f"{base_url}/api/pcsx/search?domain={domain}&query=&location=&start=0"
    response = requests.get(pcsx_url, headers=headers)
    if response.status_code == 200:
        data = response.json().get("data", {})
        jobs = data.get("positions", [])
        print(f"PCSX: Found {len(jobs)} jobs")

Parse the sitemap for complete job discovery

The sitemap contains all job URLs without pagination limits. This is the most reliable way to get a complete list of job IDs.

Step 2: Parse the sitemap for complete job discovery
import requests
import re
import xml.etree.ElementTree as ET

company = "aexp"
domain = "aexp.com"
sitemap_url = f"https://{company}.eightfold.ai/careers/sitemap.xml?domain={domain}"

response = requests.get(sitemap_url)
root = ET.fromstring(response.content)

# Extract job IDs from URLs
job_ids = []
namespace = {"ns": "http://www.sitemaps.org/schemas/sitemap/0.9"}

for url in root.findall(".//ns:url/ns:loc", namespace):
    match = re.search(r"/careers/job/(\d+)", url.text)
    if match:
        job_ids.append(match.group(1))

print(f"Found {len(job_ids)} job IDs in sitemap")

Fetch job details for each listing

The listings API does not include job descriptions. You must fetch each job's details separately using the appropriate details endpoint.

Step 3: Fetch job details for each listing
import requests
import time

company = "aexp"
domain = "aexp.com"
job_id = "39751247"
base_url = f"https://{company}.eightfold.ai"
headers = {"Accept": "application/json", "Content-Type": "application/json"}

# SmartApply details endpoint
details_url = f"{base_url}/api/apply/v2/jobs/{job_id}?domain={domain}&hl=en"
response = requests.get(details_url, headers=headers)

if response.status_code == 200:
    job = response.json()
    print({
        "id": job.get("id"),
        "title": job.get("name"),
        "location": job.get("location"),
        "department": job.get("department"),
        "description": job.get("job_description", "")[:200] + "...",
        "url": job.get("canonicalPositionUrl"),
    })

# For PCSX, use: /api/pcsx/position_details?position_id={job_id}&domain={domain}
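For PCSX tenants, the details URL noted in the comment can be assembled the same way. A minimal sketch of the URL construction, using the endpoint path given above (the response shape for PCSX details is not documented here, so parse it defensively):

```python
def pcsx_details_url(base_url: str, job_id: str, domain: str) -> str:
    """Build the PCSX position-details URL for a given job ID."""
    return f"{base_url}/api/pcsx/position_details?position_id={job_id}&domain={domain}"

# Example:
url = pcsx_details_url("https://aexp.eightfold.ai", "39751247", "aexp.com")
print(url)
```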

Handle pagination for large job boards

When using the API instead of sitemap, paginate through results by incrementing the start parameter by 10 for each request.

Step 4: Handle pagination for large job boards
import requests
import time

company = "aexp"
domain = "aexp.com"
base_url = f"https://{company}.eightfold.ai"
headers = {"Accept": "application/json", "Content-Type": "application/json"}

all_jobs = []
start = 0
batch_size = 10

while True:
    url = f"{base_url}/api/apply/v2/jobs?domain={domain}&hl=en&start={start}"
    response = requests.get(url, headers=headers)

    if response.status_code != 200:
        break

    data = response.json()
    positions = data.get("positions", [])

    if not positions:
        break

    all_jobs.extend(positions)
    print(f"Fetched {len(positions)} jobs (total: {len(all_jobs)})")

    start += batch_size
    time.sleep(0.5)  # Rate limiting

print(f"Total jobs collected: {len(all_jobs)}")

Handle custom domains and edge cases

Some companies use custom domains instead of the standard eightfold.ai subdomain. Detect these and extract the correct domain parameter.

Step 5: Handle custom domains and edge cases
import requests
from urllib.parse import urlparse

def get_domain_from_url(careers_url: str) -> tuple[str, str]:
    """Extract company slug and domain from careers URL."""
    parsed = urlparse(careers_url)
    host = parsed.netloc

    if "eightfold.ai" in host:
        # Standard subdomain format: {company}.eightfold.ai
        company = host.split(".")[0]
        domain = f"{company}.com"
    else:
        # Custom domain format: jobs.{company}.com
        parts = host.split(".")
        if parts[0] in ["jobs", "careers", "apply"]:
            domain = ".".join(parts[-2:])
            company = parts[-2]
        else:
            domain = host
            company = parts[0]

    return company, domain

# Examples
company, domain = get_domain_from_url("https://aexp.eightfold.ai/careers")
print(f"Company: {company}, Domain: {domain}")  # Company: aexp, Domain: aexp.com

company, domain = get_domain_from_url("https://jobs.arcadis.com/careers")
print(f"Company: {company}, Domain: {domain}")  # Company: arcadis, Domain: arcadis.com
Common issues
High: Two different API patterns exist (SmartApply vs PCSX)

Try the SmartApply endpoint first (/api/apply/v2/jobs). If it returns 404, fall back to the PCSX endpoint (/api/pcsx/search). Some companies use one, some use the other.

High: Listings API returns empty job descriptions

The listings endpoint does not include job descriptions. You must make a separate request to the job details endpoint for each job to get the full description HTML.

Medium: Wrong domain parameter causes 404 errors

The domain parameter must match the company's configured domain. Extract it from the sitemap URL in robots.txt or try common patterns like {subdomain}.com.

Medium: Custom domains require special handling

Some companies use custom domains (e.g., jobs.arcadis.com instead of arcadis.eightfold.ai). Maintain a mapping of custom domains to their Eightfold parameters.
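The mapping suggested above can be a plain dict consulted before the heuristics in Step 5. A minimal sketch; the arcadis entry comes from this guide, and any entries you add should be verified against each site's robots.txt or sitemap URL:

```python
# Known custom career-site hosts mapped to their Eightfold tenant parameters.
CUSTOM_DOMAINS = {
    "jobs.arcadis.com": {"company": "arcadis", "domain": "arcadis.com"},
}

def resolve_tenant(host: str) -> dict:
    """Return Eightfold parameters for a host, preferring the explicit mapping."""
    if host in CUSTOM_DOMAINS:
        return CUSTOM_DOMAINS[host]
    if host.endswith(".eightfold.ai"):
        # Standard subdomain format: {company}.eightfold.ai
        company = host.split(".")[0]
        return {"company": company, "domain": f"{company}.com"}
    raise ValueError(f"Unknown Eightfold host: {host}")

print(resolve_tenant("jobs.arcadis.com"))
print(resolve_tenant("aexp.eightfold.ai"))
```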

Medium: Rate limiting on high-volume requests

Add delays between requests (500ms minimum). When fetching many job details, use the sitemap approach first to get all IDs, then fetch details at a controlled rate.
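The sitemap-first, throttled-details pattern can be sketched with the fetch call injected as a parameter, which keeps the pacing logic separate from networking. `fetch_fn` here is a placeholder for a wrapper around the details request from Step 3:

```python
import time
from typing import Callable, Iterable

def fetch_all_details(job_ids: Iterable[str],
                      fetch_fn: Callable[[str], dict],
                      delay: float = 0.5) -> list[dict]:
    """Fetch details for each job ID at a controlled rate.

    fetch_fn is any callable that takes a job ID and returns a dict,
    e.g. a function wrapping the SmartApply details endpoint.
    """
    results = []
    for job_id in job_ids:
        results.append(fetch_fn(job_id))
        time.sleep(delay)  # stay under the unofficial rate limit
    return results

# Example with a stub fetcher (replace with a real HTTP call):
jobs = fetch_all_details(["1", "2"], lambda jid: {"id": jid}, delay=0)
print(jobs)
```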

Low: Sitemap job count differs from API totalJobs

Sitemap may be cached or updated on different schedules. Use the sitemap for discovery and API totalJobs as a validation check. Small discrepancies are normal.
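The validation check can be a small tolerance comparison; the 5% threshold below is an assumption to tune for your data, not a documented limit:

```python
def counts_roughly_match(sitemap_count: int, api_total: int,
                         tolerance: float = 0.05) -> bool:
    """Treat small sitemap/API count discrepancies as normal cache skew."""
    if api_total == 0:
        return sitemap_count == 0
    return abs(sitemap_count - api_total) / api_total <= tolerance

print(counts_roughly_match(980, 1000))  # small skew, likely cache lag
print(counts_roughly_match(800, 1000))  # large gap, worth investigating
```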

Best practices
  1. Use the sitemap for complete job discovery without pagination
  2. Try SmartApply API first, fall back to PCSX if unavailable
  3. Fetch job details separately since listings do not include descriptions
  4. Add 500ms delay between requests to avoid rate limiting
  5. Extract domain parameter from robots.txt sitemap URL
  6. Handle both standard eightfold.ai subdomains and custom domains
Or skip the complexity

One endpoint. All Eightfold jobs. No scraping, no sessions, no maintenance.

Get API access
cURL
curl "https://enterprise.jobo.world/api/jobs?sources=eightfold" \
  -H "X-Api-Key: YOUR_KEY"
Ready to integrate

Access Eightfold job data today.

One API call. Structured data. No scraping infrastructure to build or maintain — start with the free tier and scale as you grow.

99.9% API uptime
<200ms avg response
50M+ jobs processed