Run instagrapi tasks in Celery: queues, retries, rate limits

Maintained by the instagrapi contributors · Library on GitHub

You have an instagrapi-shaped workload that does not belong in a request path — a daily follower-count snapshot, a webhook that posts a comment in response to a Stripe event, a competitor-mentions sweep that runs every thirty minutes — and Celery is the obvious queue. The first iteration is rarely wrong on its own: a tasks.py with a single @app.task that constructs a Client(), calls cl.login(), makes one IG call, and returns. It works on a developer laptop, it works on staging with one worker, and it starts producing odd failures the moment a second worker comes online. Tasks that ran clean for a week begin returning challenge_required for no obvious reason. The retry log fills with please_wait_a_few_minutes warnings. Once a feedback_required lands, every subsequent retry on that account makes the suppression worse rather than better.

The cause is the same shape Django and FastAPI integrations hit: instagrapi expects a single long-lived session, a single device fingerprint, and a single rate-limit budget per account. Celery, by default, gives you the opposite — fresh process state per task, an aggressive retry policy that does the wrong thing for account-level errors, and N parallel workers all chewing through the same per-account quota. This page walks through the Celery-specific way to put instagrapi on a queue without those three failure modes: load and dump the session per task with a Redis-backed cookie jar, scope retries to genuinely transient errors, and gate every IG call through a Redis token bucket so worker count does not multiply the rate-limit budget.

Setup

Install Celery with the Redis broker, instagrapi, and a Redis client. Redis serves several roles here: Celery broker (and result backend), backing store for the shared session blob, and home for the per-account token bucket. A single Redis is fine for small deployments; split brokers from data once tasks-per-second crosses a few hundred.

pip install 'celery[redis]' instagrapi redis

Wire the Celery app and the IG task in one place. Crucially, scope autoretry_for to PleaseWaitFewMinutes only — never list FeedbackRequired or ChallengeRequired in the retry tuple, because both are account-level signals that retries make worse.

# tasks.py
from celery import Celery
from instagrapi import Client
from instagrapi.exceptions import PleaseWaitFewMinutes
import json, redis

app = Celery('ig', broker='redis://localhost:6379/0', backend='redis://localhost:6379/0')
r = redis.Redis()

@app.task(autoretry_for=(PleaseWaitFewMinutes,), retry_backoff=600,
          retry_backoff_max=3600, max_retries=3)
def fetch_user(username: str):
    cl = Client()
    blob = r.get('ig:session:main')
    if blob:
        cl.set_settings(json.loads(blob))
    cl.login(username='...', password='...')
    user = cl.user_info_by_username(username)
    r.set('ig:session:main', json.dumps(cl.get_settings()))
    return {'pk': user.pk, 'username': user.username}

The credentials in the example are placeholders — keep them in environment variables and never bake them into the image. Treat the session blob as a write-back cache: load before login, dump after every successful call so the post-call cookie state is the next task’s starting state.

Working example

The minimum production-shaped Celery example is a beat-scheduled follower sync that loads a session, runs one IG call, persists the session back, and writes results to a database — with a distributed lock so a slow run does not collide with the next scheduled invocation.

# tasks.py (continued)
from celery.schedules import crontab

app.conf.beat_schedule = {
    'sync-followers-daily': {
        'task': 'tasks.sync_followers',
        'schedule': crontab(hour=3, minute=0),
        'args': ('instagram',),
    },
}

@app.task(autoretry_for=(PleaseWaitFewMinutes,), retry_backoff=600, max_retries=3)
def sync_followers(username: str):
    lock = r.set(f'ig:lock:{username}', '1', nx=True, ex=3600)
    if not lock:
        return {'skipped': 'already running'}
    try:
        cl = _load_client()
        target = cl.user_id_from_username(username)
        followers = cl.user_followers(target, amount=500)
        _persist_session(cl)
        # write to DB here — idempotent upserts only
        return {'count': len(followers)}
    finally:
        r.delete(f'ig:lock:{username}')

Three details make this snippet survive contact with production. The SETNX lock with a one-hour TTL prevents the next scheduled run from starting if the previous one is still walking pagination — without it, two parallel user_followers calls on the same account double the rate-limit consumption and trip please_wait_a_few_minutes. The try/finally ensures the lock is released even when the task crashes; an orphaned lock would silently skip every subsequent run for an hour. The DB write path is idempotent (upserts, not inserts) because Instagram pagination occasionally re-yields the same follower across page boundaries, especially when the target account is gaining followers during the walk.

Production caveats

Three patterns repeatedly break Celery + instagrapi integrations once they leave a single-worker dev environment. They are roughly in order of how often they cost a team an afternoon.

1. Per-task session reuse vs per-task fresh login

The naive task constructs Client() and calls cl.login() on every invocation. In a queue running thousands of tasks per day, that is thousands of fresh logins from the same account — and Instagram’s risk model treats the pattern as either a credential-stuffing attempt or a hijacked account being abused, depending on whether the device fingerprint also rotates. Either reaction starts with a challenge_required and ends with the account suspended. The fix is the load-then-dump pattern from the working example: read the session blob from Redis before login, call login (which becomes a no-op if the cookies are still valid), and write the post-call settings back. The blob is a few kilobytes; round-tripping it through Redis on every task adds sub-millisecond overhead.

2. Distributed rate-limit budget

Celery’s whole appeal is horizontal scale — add workers, run more tasks per second. instagrapi’s rate-limit budget does not scale that way. Instagram budgets requests per account, and four workers running flat out against one account exhaust the per-hour quota four times faster than a single worker would. The effective budget per worker shrinks to (intended budget) / (worker count), so please_wait_a_few_minutes starts appearing long before any single worker looks busy — and the retry policy then turns that into a thundering herd of retries that all hit the same throttled account. The fix is a Redis-backed token bucket keyed by account: every task acquires a token before its IG call and waits if the bucket is empty. Tune the refill rate down until please_wait_a_few_minutes stops appearing in the worker log; that is the real per-account budget for that account’s age and warm-up state.
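A minimal sketch of the bucket, using the fixed-window variant — acquire_token and the key scheme are illustrative names, and r is any redis.Redis-compatible client. The window id is baked into the key, so stale windows expire on their own:

```python
import time


def acquire_token(r, account: str, limit: int = 200, window: int = 3600) -> bool:
    """Fixed-window budget check: True if this call still fits the
    account's quota for the current window, False if exhausted.
    Lives in Redis because the budget belongs to the account, not
    the worker process."""
    key = f'ig:bucket:{account}:{int(time.time()) // window}'
    pipe = r.pipeline()
    pipe.incr(key)            # atomically count this call
    pipe.expire(key, window)  # window key cleans itself up
    count, _ = pipe.execute()
    return count <= limit
```

Inside a task a miss becomes a soft retry rather than a busy-wait — with bind=True, something like: if not acquire_token(r, 'main'): raise self.retry(countdown=60).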

3. Retry storms on feedback_required

FeedbackRequired is the failure mode that hurts the most when retried. It is account-level: Instagram has flagged the account for spammy behaviour and wants you to stop. Every retry within the suppression window deepens the flag and stretches the recovery time. The catch-all autoretry_for=(Exception,) shape — which a developer typically adds when chasing flakiness — turns a single feedback_required into ten retries spaced minutes apart, which is exactly the pattern that escalates a soft block to a hard one. Scope retries narrowly: PleaseWaitFewMinutes is safe to retry with backoff because it is a rate-limit signal, ClientNetworkError is safe to retry because it is transport-level, but FeedbackRequired and ChallengeRequired must route to a dead-letter queue and an alert that pages a human, not back into the worker pool.
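The escalation path can be sketched as a small classifier run in the task's except block — handle_ig_failure and the ig:dlq key are hypothetical names, and in tasks.py the tuples would be (PleaseWaitFewMinutes, ClientNetworkError) and (FeedbackRequired, ChallengeRequired):

```python
import json


def handle_ig_failure(r, exc: Exception, task_name: str, payload: dict,
                      retryable: tuple = (), fatal: tuple = ()) -> str:
    """Classify a failed IG call: park account-level errors on a
    dead-letter list for a human, let transient ones go back into
    Celery's retry machinery."""
    if isinstance(exc, fatal):
        # FeedbackRequired / ChallengeRequired land here in production.
        r.rpush('ig:dlq', json.dumps({
            'task': task_name,
            'payload': payload,
            'error': type(exc).__name__,
        }))
        return 'dead-letter'   # caller alerts and does NOT retry
    if isinstance(exc, retryable):
        return 'retry'         # caller raises self.retry with backoff
    return 'raise'             # unknown error: fail loudly
```

The dead-letter list is deliberately dumb — a Redis list a human drains — because the recovery action for an account-level flag is a decision, not a retry.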

Fix in instagrapi

Four steps, in order — each one assumes the previous one is in place.

  1. Load and dump the session per task. Wrap the load/dump in a small helper so every task uses it consistently. The helper reads ig:session:<account> from Redis, calls set_settings() if a blob exists, runs login() (which becomes a no-op when the cookies are valid), and dumps get_settings() back to Redis after the IG call returns. Pair the load with the same SETNX refresh lock the Django integration uses so two workers do not race each other into a double-login on cold cache.

  2. Token-bucket the per-account budget in Redis. Implement a sliding-window or fixed-bucket counter keyed by account name. Every task decrements before its IG call and either waits or raises a soft retry if the budget is empty. Start with a conservative bucket size — 200 calls per hour for a fresh account, 600 for a hardened one — and tune the refill rate down until please_wait_a_few_minutes disappears from the log. The bucket is a global property of the account, not the worker, so it must live outside the worker process.

  3. Scope autoretry_for correctly. The retry tuple should contain only transient transport and rate-limit errors. Read the please_wait_a_few_minutes reference for the right backoff shape. Add feedback_required (see the feedback_required reference) and challenge_required to a non-retry list that escalates to a dead-letter queue.

  4. Use Celery beat with distributed locks for schedules. Periodic tasks (daily syncs, hourly mention sweeps) need a SETNX lock on the per-account key so a slow run cannot collide with the next scheduled invocation. The lock TTL should be slightly longer than the worst-case task duration; release it in a finally so a crashed task does not orphan the lock for the full TTL.

Deep dive

The retry policy interacts with Celery’s task time limit in a way that is easy to misconfigure. A task with time_limit=300 that blocks in-process — sleeping on an empty token bucket, or walking deep pagination inside one body — is killed by the time limit mid-call, and a hard kill never reaches the session dump, so the next task starts from stale cookie state. The fix is to keep the task body short — make every task exactly one IG call and one DB write — so the body can never outlive the time limit. If a logical operation needs many IG calls (a deep follower walk, a multi-shortcode lookup), break it into a chain of small tasks; the Celery chain and group primitives are designed for exactly this shape, and they let each link of the chain own its own retry policy and rate-limit token. The feedback_required quarantine path is also worth wiring to Sentry or a similar exception tracker — the alert needs to land in front of a human within minutes, because the recovery action is to stop sending traffic to that account, and only a human can decide whether to switch to a backup account or just wait.

Frequently asked

Why use Celery for instagrapi instead of just calling it directly?

Three reasons: long IG calls don't block your web request workers; retries are automatic on transient errors (please_wait_a_few_minutes, ClientNetworkError); and Celery scales horizontally — add workers to scale throughput, throttle by rate-limit-aware tokens.

How does Celery handle instagrapi sessions across tasks?

The session lives in shared storage (Redis recommended). Each task loads the session, makes the IG call, dumps any updated cookies back. The session blob is small (a few KB); locking is the only complication when multiple workers refresh simultaneously.

What's the right Celery retry policy for please_wait_a_few_minutes?

autoretry_for=(PleaseWaitFewMinutes,) with retry_backoff=600 (start at 10 minutes), retry_backoff_max=3600 (cap at 1 hour), max_retries=3. After 3 retries, escalate to a dead-letter queue and alert.

Should I use Celery beat for scheduled instagrapi tasks?

Yes — Celery beat is the right primitive for periodic IG syncs (every-30-min mention checks, daily follower counts). Use distributed locks (Redis SETNX) to prevent overlapping schedules from running the same task twice.
