Protobuf Messages
The structure of the API is described in WebSocket API. The details of all the individual messages are documented here. The client always sends a Request message. The server always sends a ServerMessage message, which contains either a Response to a request or an Event belonging to a subscription.
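The Response/Event split can be sketched as follows. This is only an illustration: the dataclasses are hypothetical stand-ins for the generated protobuf classes (reduced to a few fields), and `dispatch` is an assumed helper name, not part of the API.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-ins for the generated protobuf classes.
@dataclass
class Response:
    request_id: int
    is_success: bool


@dataclass
class Event:
    subscription_id: int


@dataclass
class ServerMessage:
    event: Optional[Event] = None
    response: Optional[Response] = None


def dispatch(message: ServerMessage) -> str:
    """Describe which kind of payload a server message carries."""
    if message.response is not None:
        return f"response to request {message.response.request_id}"
    if message.event is not None:
        return f"event for subscription {message.event.subscription_id}"
    raise ValueError("ServerMessage carries neither a response nor an event")
```

A real client would probably route responses to whoever issued the request and events to the matching subscription handler.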
CaptchaSolver
Generic CAPTCHA solver.
Field | Type | Label | Description |
solver_id | bytes | optional | Unique identifier. |
name | string | optional | Name of CAPTCHA solver entry. |
created_at | string | optional | When this entry was created. (RFC-3339) |
updated_at | string | optional | When this entry was last updated. (RFC-3339) |
antigate | CaptchaSolverAntigate | optional | Extra configuration for Antigate CAPTCHA solver. |
CaptchaSolverAntigate
CAPTCHA solver for Antigate-compatible API.
Field | Type | Label | Description |
service_url | string | optional | The CAPTCHA solving URL endpoint. |
api_key | string | optional | An API key. |
require_phrase | bool | optional | Some CAPTCHAs use a phrase instead of a short sequence of characters. |
case_sensitive | bool | optional | Some CAPTCHAs are case sensitive. |
characters | CaptchaSolverAntigateCharacters | optional | CAPTCHAs may permit different sets of characters. |
require_math | bool | optional | Some CAPTCHAs require solving a math problem. |
min_length | int32 | optional | The minimum length of the CAPTCHA. |
max_length | int32 | optional | The maximum length of the CAPTCHA. |
CrawlResponse
Contains a crawl response, i.e. something downloaded by the crawler or the error/exception that resulted from attempting to download it.
Field | Type | Label | Description |
body | bytes | optional | The raw response body. |
completed_at | string | optional | When this response was finished downloading. (RFC-3339) |
content_type | string | optional | The MIME type of this content. |
cost | double | optional | This item's crawl cost. (See "Crawl Policy" for more details.) |
duration | double | optional | How long this item took to download. |
exception | string | optional | The complete text of any exception that occurred while downloading. |
headers | Header | repeated | The HTTP headers for this response. |
is_compressed | bool | optional | If true, the response body is compressed. |
is_success | bool | optional | If true, the response is considered a success. If false, either an HTTP error or an exception occurred. |
job_id | bytes | optional | The identifier of the crawl job that this crawl response is associated with. |
started_at | string | optional | When the request for this item was sent. (RFC-3339) |
status_code | int32 | optional | The HTTP status code returned for this response. |
url | string | optional | The original request URL. |
url_can | string | optional | The canonicalized request URL. |
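A crawl response falls into one of three outcomes: success, HTTP error, or exception. The sketch below shows one way a client might tell them apart from `is_success` and `exception`; the helper name `classify` and the rule that a non-empty `exception` marks the exception case are assumptions, not documented behavior.

```python
from typing import Optional


def classify(is_success: bool, exception: Optional[str]) -> str:
    """Classify a crawl response as success, exception, or HTTP error."""
    if is_success:
        return "success"
    # Assumption: a failed response with exception text never received
    # a usable HTTP response; otherwise it is treated as an HTTP error.
    if exception:
        return "exception"
    return "error"
```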
DomainLogin
Login information for a domain.
Field | Type | Label | Description |
domain | string | optional | The domain for which this login information is valid. |
login_url | string | optional | The URL for logging into this domain. |
login_test | string | optional | |
users | DomainLoginUser | repeated | Credentials for this domain. |
DomainLoginUser
Username/password for a domain.
Field | Type | Label | Description |
username | string | optional | The username part of a credential. |
password | string | optional | The password part of a credential. |
working | bool | optional | If true, this credential is expected to work. If false, the credential is expected to fail. |
Event
An Event is sent for any active subscription whenever data is available for that subscription.
Field | Type | Label | Description |
subscription_id | int32 | required | A unique identifier for each subscription. |
job_list | JobList | optional | A list of crawl jobs. |
schedule_list | ScheduleList | optional | A list of schedules. |
resource_frame | ResourceFrame | optional | A snapshot of resource consumption at a point in time. |
subscription_closed | SubscriptionClosed | optional | Indicates that a subscription has ended. |
sync_item | SyncItem | optional | A single crawled item wrapped with metadata to assist synchronization. |
task_tree | TaskTree | optional | A snapshot of the Trio tasks running at a point in time. |
Header
An HTTP header.
Field | Type | Label | Description |
key | string | optional | The name of the header. |
value | string | optional | The header's value. |
Job
A crawl job.
Field | Type | Label | Description |
job_id | bytes | required | The unique identifier for a crawl job. |
seeds | string | repeated | The list of seed URLs for a crawl job. |
policy | Policy | optional | The crawl policy associated with a crawl job. |
name | string | optional | The name assigned to this crawl job. |
tags | string | repeated | The tags associated with this crawl job. |
run_state | JobRunState | optional | The crawl job's current run state. |
started_at | string | optional | When this crawl job started. (RFC-3339) |
completed_at | string | optional | When this crawl job ended. (RFC-3339) |
item_count | int32 | optional | The number of requests sent for this crawl, inclusive of HTTP errors and exceptions. Default: -1 |
http_success_count | int32 | optional | The number of successful responses for this crawl. Default: -1 |
http_error_count | int32 | optional | The number of HTTP error responses for this crawl. Default: -1 |
exception_count | int32 | optional | The number of requests that resulted in an exception during this crawl. Default: -1 |
http_status_counts | Job.HttpStatusCountsEntry | repeated | Counts of the number of times each HTTP status code has been received during this crawl. |
Job.HttpStatusCountsEntry
Field | Type | Label | Description |
key | int32 | optional | |
value | int32 | optional | |
JobList
A list of jobs.
This seemingly stupid message is necessary so that a list of jobs can be used as the body of a "oneof".
Field | Type | Label | Description |
jobs | Job | repeated | |
Page
For paginating large sets, similar to LIMIT/OFFSET in SQL.
Field | Type | Label | Description |
limit | int32 | optional | The number of items to include on each page. Default: 10 |
offset | int32 | optional | The number of items to skip over before emitting items for the current page. |
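As a minimal sketch of how `limit` and `offset` combine, the generator below yields the pairs needed to walk a result set of known size. The name `page_offsets` is hypothetical; `total` is assumed to come from the `total` field of a list response.

```python
from typing import Iterator, Tuple


def page_offsets(total: int, limit: int = 10) -> Iterator[Tuple[int, int]]:
    """Yield (limit, offset) pairs that together cover `total` items."""
    for offset in range(0, total, limit):
        yield limit, offset
```

With 25 items and the default limit of 10, this requests offsets 0, 10, and 20; the last page comes back partially filled.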
PerformanceProfileFunction
Performance profile data for a single function.
Field | Type | Label | Description |
file | string | optional | |
line_number | int32 | optional | |
function | string | optional | |
calls | int32 | optional | |
non_recursive_calls | int32 | optional | |
total_time | double | optional | |
cumulative_time | double | optional | |
Policy
Settings that dictate crawler behavior. Policy is set on a job-by-job basis.
Field | Type | Label | Description |
policy_id | bytes | optional | Unique identifier for this policy. |
name | string | optional | The name assigned to this policy. |
created_at | string | optional | When this policy was created. (RFC-3339) |
updated_at | string | optional | When this policy was last updated. (RFC-3339) |
captcha_solver_id | bytes | optional | The CAPTCHA solver to use. |
authentication | PolicyAuthentication | optional | The authentication subpolicy controls how the crawler logs in to websites. |
limits | PolicyLimits | optional | The limits subpolicy controls how long the crawler runs. |
proxy_rules | PolicyProxyRule | repeated | The proxy subpolicy controls what proxies the crawler uses. |
mime_type_rules | PolicyMimeTypeRule | repeated | The MIME-type subpolicy controls what types of content the crawler downloads. |
robots_txt | PolicyRobotsTxt | optional | The robots.txt subpolicy controls how the crawler handles robots directives. |
url_normalization | PolicyUrlNormalization | optional | The normalization subpolicy controls how the crawler normalizes URLs in its crawl frontier. |
url_rules | PolicyUrlRule | repeated | The URL rules subpolicy controls how the crawler computes cost numbers for URLs in its crawl frontier. |
user_agents | PolicyUserAgent | repeated | The user agent subpolicy controls what user agent strings the crawler sends. |
PolicyAuthentication
Settings for authenticated crawling.
Field | Type | Label | Description |
enabled | bool | optional | |
PolicyLimits
Specifies limits on how far or how long a crawl runs.
Field | Type | Label | Description |
max_cost | double | optional | Crawl items that exceed this cost will not be fetched. |
max_duration | double | optional | The crawl will end after this many seconds have elapsed. |
max_items | int32 | optional | Crawl will end after this many items have been downloaded. |
PolicyMimeTypeRule
Specifies whether to save or discard certain responses based on MIME type.
If pattern is not specified, then this rule applies to all responses.
Field | Type | Label | Description |
pattern | string | optional | A regex pattern to match against the MIME type. |
match | PatternMatch | optional | Allows the pattern to be inverted. |
save | bool | optional | If true, items with matching MIME types are saved and non-matches are discarded. If false, it's the opposite. |
PolicyProxyRule
When and how to proxy requests.
Field | Type | Label | Description |
pattern | string | optional | A regex pattern to match against the URL. |
match | PatternMatch | optional | Allows the pattern to be inverted. |
proxy_url | string | optional | The URL of the proxy to use when items match this rule. If not provided, then no proxy is used. |
PolicyRobotsTxt
Specify handling of robots.txt.
Field | Type | Label | Description |
usage | PolicyRobotsTxt.Usage | required | |
PolicyUrlNormalization
Field | Type | Label | Description |
enabled | bool | optional | If true, URL normalization is applied to URLs in the crawl frontier. If false, URLs are not normalized. |
strip_parameters | string | repeated | A list of URL query parameters that are removed during normalization. |
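To illustrate what `strip_parameters` means, the sketch below removes named query parameters from a URL with the standard library. It is not the crawler's own normalization code, which may do more; the function name is chosen here only for illustration.

```python
from typing import Iterable
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_parameters(url: str, names: Iterable[str]) -> str:
    """Return `url` with the query parameters listed in `names` removed."""
    drop = set(names)
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in drop
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Stripping session identifiers this way helps keep the same page from entering the crawl frontier under many different URLs.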
PolicyUrlRule
A rule for adjusting a URL's cost.
Field | Type | Label | Description |
pattern | string | optional | A regex pattern to match against the URL. |
match | PatternMatch | optional | Allows the pattern to be inverted. |
action | PolicyUrlRule.Action | optional | The arithmetic operation to perform when a URL is matched. |
amount | double | optional | The right operand to the arithmetic operation. (The left operand is the cost of the parent item.) |
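The cost arithmetic can be sketched as below. Two things are assumptions rather than documented behavior: rules are tried in order with the first applicable rule winning, and a rule without a pattern applies to every URL. `UrlRule` is a hypothetical stand-in for the generated PolicyUrlRule class.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

MATCHES, DOES_NOT_MATCH = 1, 2  # PatternMatch values
ADD, MULTIPLY = 1, 2            # PolicyUrlRule.Action values


@dataclass
class UrlRule:
    action: int
    amount: float
    pattern: Optional[str] = None
    match: int = MATCHES


def child_cost(parent_cost: float, url: str, rules: List[UrlRule]) -> float:
    """Compute a URL's cost from its parent's cost (first applicable rule wins)."""
    for rule in rules:
        if rule.pattern is None:
            applies = True  # assumption: no pattern means "applies to all"
        else:
            found = re.search(rule.pattern, url) is not None
            applies = found if rule.match == MATCHES else not found
        if applies:
            if rule.action == ADD:
                return parent_cost + rule.amount
            return parent_cost * rule.amount
    raise ValueError("no rule applies to this URL")
```

For example, a rule that adds 1 for on-site links followed by a catch-all rule makes cost grow by one per hop within a site. Combined with `max_cost`, that acts like a depth limit.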
PolicyUserAgent
Specifies a user agent string to send when downloading a resource.
Field | Type | Label | Description |
name | string | required | |
RateLimit
Model for a rate limit.
If domain is not specified, the global rate limit is modified. If delay is not specified, then the rate limit for the specified domain is deleted. Either delay or domain must be specified: you are not allowed to delete the global limit.
If the client sends a name, the server ignores it. The server will always send a name to the client.
Field | Type | Label | Description |
name | string | optional | The name of the rate-limit. (Read-only) |
delay | float | optional | The delay, in seconds, between requests associated with this rate-limit token. |
token | bytes | optional | The rate-limit token. (Read-only) |
domain | string | optional | The name of the domain that this rate limit applies to. |
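The interaction between `domain` and `delay` is easy to get wrong, so here is a small sketch of the four cases described above. The function name and the returned strings are purely illustrative.

```python
from typing import Optional


def describe_rate_limit(domain: Optional[str], delay: Optional[float]) -> str:
    """Explain what a rate-limit request with these fields would do."""
    if domain is None and delay is None:
        # The global limit cannot be deleted.
        raise ValueError("either domain or delay must be specified")
    if domain is None:
        return f"set global delay to {delay}s"
    if delay is None:
        return f"delete rate limit for {domain}"
    return f"set delay for {domain} to {delay}s"
```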
Request
A Request is issued by the client, and the server is expected to send exactly 1 Response for each Request.
Field | Type | Label | Description |
request_id | int32 | required | The request ID is included in the response so that the client can correlate requests and responses. (Responses may arrive in a different order than the requests were sent.) |
delete_captcha_solver | RequestDeleteCaptchaSolver | optional | |
get_captcha_solver | RequestGetCaptchaSolver | optional | |
list_captcha_solvers | RequestListCaptchaSolvers | optional | |
set_captcha_solver | RequestSetCaptchaSolver | optional | |
delete_job | RequestDeleteJob | optional | |
get_job | RequestGetJob | optional | |
get_job_items | RequestGetJobItems | optional | |
list_jobs | RequestListJobs | optional | |
set_job | RequestSetJob | optional | |
delete_schedule | RequestDeleteSchedule | optional | |
get_schedule | RequestGetSchedule | optional | |
list_schedules | RequestListSchedules | optional | |
list_schedule_jobs | RequestListScheduleJobs | optional | |
set_schedule | RequestSetSchedule | optional | |
delete_policy | RequestDeletePolicy | optional | |
get_policy | RequestGetPolicy | optional | |
list_policies | RequestListPolicies | optional | |
set_policy | RequestSetPolicy | optional | |
delete_domain_login | RequestDeleteDomainLogin | optional | |
get_domain_login | RequestGetDomainLogin | optional | |
list_domain_logins | RequestListDomainLogins | optional | |
set_domain_login | RequestSetDomainLogin | optional | |
list_rate_limits | RequestListRateLimits | optional | |
set_rate_limit | RequestSetRateLimit | optional | |
performance_profile | RequestPerformanceProfile | optional | |
subscribe_job_status | RequestSubscribeJobStatus | optional | |
subscribe_job_sync | RequestSubscribeJobSync | optional | |
subscribe_resource_monitor | RequestSubscribeResourceMonitor | optional | |
subscribe_task_monitor | RequestSubscribeTaskMonitor | optional | |
unsubscribe | RequestUnsubscribe | optional | |
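Because responses may arrive out of order, a client needs some bookkeeping keyed on `request_id`. The class below is one possible sketch; `PendingRequests` and its methods are hypothetical names, and the sequential numbering starting from 1 is an assumption made for the example.

```python
import itertools


class PendingRequests:
    """Track outstanding requests so responses can be matched by ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def register(self, description: str) -> int:
        """Allocate a request ID and remember what the request was for."""
        request_id = next(self._ids)
        self._pending[request_id] = description
        return request_id

    def resolve(self, request_id: int) -> str:
        """Look up (and forget) the request that a response belongs to."""
        return self._pending.pop(request_id)
```

Since the server sends exactly one Response per Request, each entry can be discarded as soon as it is resolved.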
RequestDeleteCaptchaSolver
Delete a CAPTCHA solver.
Field | Type | Label | Description |
solver_id | bytes | optional | |
RequestDeleteDomainLogin
Delete a domain login and all of its credentials.
Field | Type | Label | Description |
domain | string | optional | |
RequestDeleteJob
Delete a job and all of its items.
Field | Type | Label | Description |
job_id | bytes | required | |
RequestDeletePolicy
Delete a policy.
Field | Type | Label | Description |
policy_id | bytes | required | |
RequestDeleteSchedule
Delete a job schedule.
Field | Type | Label | Description |
schedule_id | bytes | required | |
RequestGetCaptchaSolver
Get a CAPTCHA solver by ID.
Field | Type | Label | Description |
solver_id | bytes | required | |
RequestGetDomainLogin
Get credential data for the specified domain.
Field | Type | Label | Description |
domain | string | required | |
RequestGetJob
Get metadata for the specified job.
Field | Type | Label | Description |
job_id | bytes | required | |
RequestGetJobItems
Get a list of items (crawl responses) from a job.
Field | Type | Label | Description |
job_id | bytes | required | Get items belonging to this job. |
include_success | bool | optional | |
include_error | bool | optional | |
include_exception | bool | optional | |
compression_ok | bool | optional | If true, some items may have compressed response bodies. The caller should check the compression flag for each item and handle accordingly. Default: true |
page | Page | optional | Pagination options for job items. |
RequestGetPolicy
Get a policy.
Field | Type | Label | Description |
policy_id | bytes | required | |
RequestGetSchedule
Get metadata for the specified job schedule.
Field | Type | Label | Description |
schedule_id | bytes | required | |
RequestListCaptchaSolvers
Get a list of CAPTCHA solvers.
Field | Type | Label | Description |
page | Page | optional | |
RequestListDomainLogins
Get a list of domain logins.
Field | Type | Label | Description |
page | Page | optional | |
RequestListJobs
Get a list of jobs.
Field | Type | Label | Description |
page | Page | optional | Pagination options for the job list. |
started_after | string | optional | Only return jobs started after the given datetime. (RFC-3339) |
tag | string | optional | Only return jobs matching the given tag. |
schedule_id | bytes | optional | Only return jobs belonging to the given schedule. |
RequestListPolicies
Get a list of policies.
Field | Type | Label | Description |
page | Page | optional | Pagination options for policies. |
RequestListRateLimits
Show rate limits.
Field | Type | Label | Description |
page | Page | optional | Pagination options for rate limits. |
RequestListScheduleJobs
Get a list of jobs for a given schedule.
Field | Type | Label | Description |
schedule_id | bytes | required | |
page | Page | optional | Pagination options for job schedules. |
RequestListSchedules
Get a list of job schedules.
Field | Type | Label | Description |
page | Page | optional |
|
RequestPerformanceProfile
Request a performance profile.
Field | Type | Label | Description |
duration | double | optional | The amount of time to spend collecting samples. Default: 5 |
sort_by | string | optional | Default: total_time |
top_n | int32 | optional | |
RequestSetCaptchaSolver
Create or modify a CAPTCHA solver.
Field | Type | Label | Description |
solver | CaptchaSolver | optional | |
RequestSetDomainLogin
Add or update metadata for a domain login.
Field | Type | Label | Description |
login | DomainLogin | optional | |
RequestSetJob
Create a job or update state for an existing job.
Name, seeds, tags, and policy may only be specified for new jobs. Once a job is created, only the run state may be changed.
Field | Type | Label | Description |
job_id | bytes | optional | If specified, update the state of that job. Omit when creating new jobs. |
run_state | JobRunState | optional | Set the run state for the job. |
policy_id | bytes | optional | Set the policy for this job. (This is only applicable when creating a new job.) |
seeds | string | repeated | Set the seeds for this job. (This is only applicable when creating a new job.) |
name | string | optional | Set the name for this job. (This is only applicable when creating a new job.) |
tags | string | repeated | Set the tags for this job. (This is only applicable when creating a new job.) |
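A client may want to check this create-versus-update rule before sending. The sketch below works on a plain dict of field names rather than the generated message class. Whether the server rejects or silently ignores creation-only fields on an update is not stated here, so the `ValueError` is an assumption of this example.

```python
CREATION_ONLY = {"policy_id", "seeds", "name", "tags"}


def classify_set_job(fields: dict) -> str:
    """Return "create" or "update" for a dict of RequestSetJob fields."""
    if "job_id" not in fields:
        return "create"
    # An existing job may only have its run state changed.
    extra = sorted(CREATION_ONLY.intersection(fields))
    if extra:
        raise ValueError(f"only valid when creating a job: {', '.join(extra)}")
    return "update"
```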
RequestSetPolicy
Create or update a crawl policy.
If the policy's ID is blank, then the server will create a new policy. If the ID is not blank, then the server will update the corresponding policy.
Field | Type | Label | Description |
policy | Policy | required | |
RequestSetRateLimit
Set a rate limit.
Field | Type | Label | Description |
domain | string | optional | The domain to set the rate limit for. |
delay | float | optional | The delay, in seconds, between requests to this domain. |
RequestSetSchedule
Create or update a job schedule.
Field | Type | Label | Description |
schedule | Schedule | optional | |
RequestSubscribeJobStatus
Subscribe to job status, e.g. run state, statistics, etc.
Field | Type | Label | Description |
min_interval | double | optional | The minimum amount of time between job status messages. If multiple job statuses change in rapid succession, those changes are coalesced into a single event. Default: 1 |
RequestSubscribeJobSync
Synchronize crawl responses.
Field | Type | Label | Description |
job_id | bytes | required | The job ID to subscribe to. |
sync_token | bytes | optional | If provided, resume an earlier sync subscription at the point represented by this token. (Tokens are obtained from previously received sync items.) |
compression_ok | bool | optional | If true, then some response bodies may be compressed. The caller should check the compression flag on each item and handle accordingly. Default: true |
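Resuming a sync comes down to remembering the token of the last item processed and sending it back as `sync_token`. The sketch below models that with a plain dict of request fields; `SyncState` is a hypothetical helper, not part of the API.

```python
from typing import Optional


class SyncState:
    """Remember the most recent sync token for one job."""

    def __init__(self, job_id: bytes):
        self.job_id = job_id
        self.token: Optional[bytes] = None

    def record(self, token: bytes) -> None:
        """Call after each SyncItem has been fully processed."""
        self.token = token

    def subscribe_fields(self) -> dict:
        """Fields for the next RequestSubscribeJobSync."""
        fields = {"job_id": self.job_id}
        if self.token is not None:
            fields["sync_token"] = self.token
        return fields
```

Persisting the token only after an item has been handled means a crash re-delivers that item at worst, rather than skipping it.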
RequestSubscribeResourceMonitor
Subscribe to resource monitoring, e.g. CPU usage, memory usage, downloads/sec, etc.
Field | Type | Label | Description |
history | int32 | optional | Default: 300 |
RequestSubscribeTaskMonitor
Subscribe to periodic snapshots of the running Trio tasks.
Field | Type | Label | Description |
period | double | optional | The number of seconds in between task snapshots. Default: 3 |
RequestUnsubscribe
Close the specified subscription.
Field | Type | Label | Description |
subscription_id | int32 | required | |
ResourceFrame
Data about resource consumption.
Field | Type | Label | Description |
timestamp | string | optional | When this snapshot was created. (RFC-3339) |
cpus | ResourceFrameCpu | repeated | Information about CPU usage. |
memory | ResourceFrameMemory | optional | Information about memory usage. |
disks | ResourceFrameDisk | repeated | Information about disk usage. |
networks | ResourceFrameNetwork | repeated | Information about network interface usage. |
jobs | ResourceFrameJob | repeated | Information about current crawl jobs. |
current_downloads | int32 | optional | The total number of in-flight requests made by the downloader. |
maximum_downloads | int32 | optional | The maximum number of in-flight requests permitted by the downloader. |
rate_limiter | int32 | optional | The number of items buffered inside the rate limiter. |
ResourceFrameCpu
CPU usage.
Field | Type | Label | Description |
usage | double | optional | CPU utilization as a percentage (0.0-100.0) |
ResourceFrameDisk
Disk usage.
Field | Type | Label | Description |
mount | string | optional | The mountpoint for a file system. |
used | int64 | optional | The amount of space used on a file system, in bytes. |
total | int64 | optional | The total space on a file system, in bytes. |
ResourceFrameJob
Resources used by a crawl job.
Field | Type | Label | Description |
job_id | bytes | optional | The job's identifier. |
name | string | optional | The job's name. |
current_downloads | int32 | optional | The number of items this job is currently downloading. |
ResourceFrameMemory
Memory usage.
Field | Type | Label | Description |
used | int64 | optional | Memory usage in bytes. |
total | int64 | optional | The total amount of memory on the system, in bytes. |
ResourceFrameNetwork
Network usage.
Field | Type | Label | Description |
name | string | optional | The name of the network interface. |
sent | int64 | optional | The number of bytes sent on this interface. |
received | int64 | optional | The number of bytes received on this interface. |
Response
The server sends exactly one Response for each Request it receives.
Field | Type | Label | Description |
request_id | int32 | required | The request ID will match the request that prompted this response. |
is_success | bool | required | If true, this is a successful response to a request. If false, an error occurred. |
error_message | string | optional | If this is an error response, this field contains information about the error. |
solver | CaptchaSolver | optional | |
new_solver | ResponseNewCaptchaSolver | optional | |
list_captcha_solvers | ResponseListCaptchaSolvers | optional | |
domain_login | DomainLogin | optional | |
domain_login_user | DomainLoginUser | optional | |
list_domain_logins | ResponseListDomainLogins | optional | |
job | Job | optional | |
new_job | ResponseNewJob | optional | |
list_items | ResponseListItems | optional | |
list_jobs | ResponseListJobs | optional | |
schedule | Schedule | optional | |
new_schedule | ResponseNewSchedule | optional | |
list_schedules | ResponseListSchedules | optional | |
list_schedule_jobs | ResponseListScheduleJobs | optional | |
policy | Policy | optional | |
new_policy | ResponseNewPolicy | optional | |
list_policies | ResponseListPolicies | optional | |
list_rate_limits | ResponseListRateLimits | optional | |
new_subscription | ResponseNewSubscription | optional | |
performance_profile | ResponsePerformanceProfile | optional | |
ResponseListCaptchaSolvers
Return a list of CAPTCHA solvers.
Field | Type | Label | Description |
solvers | CaptchaSolver | repeated | |
total | int32 | optional | The total number of solvers (not the count of solvers included in this response). |
ResponseListDomainLogins
Return a list of domain logins.
Field | Type | Label | Description |
logins | DomainLogin | repeated | |
total | int32 | optional | The total number of domains (not the count of domains included in this response). |
ResponseListItems
Return a list of items (crawl responses) for a job.
Field | Type | Label | Description |
items | CrawlResponse | repeated | |
total | int32 | optional | The total number of items (not the count of items included in this response). |
ResponseListJobs
Return a list of jobs.
Field | Type | Label | Description |
jobs | Job | repeated | |
total | int32 | optional | The total number of jobs (not the count of jobs included in this response). |
ResponseListPolicies
Return a list of policies.
Field | Type | Label | Description |
policies | Policy | repeated | |
total | int32 | optional | The total number of policies (not the count of policies included in this response). |
ResponseListRateLimits
Return a list of rate limits.
Field | Type | Label | Description |
rate_limits | RateLimit | repeated | |
total | int32 | optional | The total number of rate limits (not the count of rate limits included in this response). |
ResponseListScheduleJobs
Return a list of jobs for a given schedule.
Field | Type | Label | Description |
jobs | Job | repeated | |
total | int32 | optional | The total number of jobs associated with this schedule (not just the ones included in this response). |
ResponseListSchedules
Return a list of job schedules.
Field | Type | Label | Description |
schedules | Schedule | repeated | |
total | int32 | optional | The total number of job schedules (not just the ones included in this response). |
ResponseNewCaptchaSolver
A response containing the ID of a newly created CAPTCHA solver.
Field | Type | Label | Description |
solver_id | bytes | required | |
ResponseNewJob
A response containing the ID of a newly created job.
Field | Type | Label | Description |
job_id | bytes | required | |
ResponseNewPolicy
A response containing the ID of a newly created policy.
Field | Type | Label | Description |
policy_id | bytes | required | |
ResponseNewSchedule
A response containing the ID of a newly created job schedule.
Field | Type | Label | Description |
schedule_id | bytes | required | |
ResponseNewSubscription
A response containing the ID of a newly created subscription.
Field | Type | Label | Description |
subscription_id | int32 | required | |
ResponsePerformanceProfile
Contains performance profile data.
Field | Type | Label | Description |
total_calls | int32 | optional | |
total_time | double | optional | |
functions | PerformanceProfileFunction | repeated | |
Schedule
Schedule information for a job.
Field | Type | Label | Description |
schedule_id | bytes | optional | Unique identifier for this schedule. |
created_at | string | optional | When this schedule was created. (RFC-3339) |
updated_at | string | optional | When this schedule was last updated. (RFC-3339) |
enabled | bool | optional | If true, jobs will run as scheduled. If false, the schedule is ignored and no new jobs will be created. |
time_unit | ScheduleTimeUnit | optional | The unit of measurement referred to by `num_units`. |
num_units | int32 | optional | The number of time units that elapse between jobs. |
timing | ScheduleTiming | optional | Indicates how the timing of a new job is related to the timing of the previous job. |
schedule_name | string | optional | The name of this schedule. |
job_name | string | optional | The template to use when naming jobs created by this schedule. |
seeds | string | repeated | The seed URLs to use for jobs created by this schedule. |
policy_id | bytes | optional | The policy to use for jobs created by this schedule. |
tags | string | repeated | The tags to use for jobs created by this schedule. |
job_count | int32 | optional | The number of jobs created by this schedule. |
ScheduleList
A list of job schedules.
This seemingly stupid message is necessary so that a list of job schedules can be used as the body of a "oneof".
Field | Type | Label | Description |
schedules | Schedule | repeated | |
ServerMessage
A wrapper for all server messages that contains either a response to a command or a subscription event.
Field | Type | Label | Description |
event | Event | optional | |
response | Response | optional | |
SubscriptionClosed
Sent when the server ends a subscription.
Field | Type | Label | Description |
reason | SubscriptionClosed.Reason | required | |
message | string | optional | |
SyncItem
An item sent when syncing a crawl job. One event is sent for each crawl response.
Field | Type | Label | Description |
item | CrawlResponse | required | The item downloaded by the crawl job. |
token | bytes | required | The sync token associated with this item. This can be used to resume a subscription later from the same point. |
TaskTree
A subtree of Trio tasks.
Field | Type | Label | Description |
name | string | optional | The name of the Trio task. |
subtasks | TaskTree | repeated | Child tasks belonging to this task. |
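Since TaskTree is recursive, a client typically walks it depth-first. The sketch below renders a tree as indented lines; the `TaskTree` dataclass is a hypothetical stand-in for the generated message class, and `render` is an illustrative helper.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskTree:
    name: str
    subtasks: List["TaskTree"] = field(default_factory=list)


def render(tree: TaskTree, depth: int = 0) -> List[str]:
    """Return one line per task, indented two spaces per nesting level."""
    lines = ["  " * depth + tree.name]
    for subtask in tree.subtasks:
        lines.extend(render(subtask, depth + 1))
    return lines
```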
CaptchaSolverAntigateCharacters
Name | Number | Description |
ALPHANUMERIC | 1 | The CAPTCHA may contain any alphanumeric characters. |
NUMBERS_ONLY | 2 | The CAPTCHA contains numeric characters only. |
ALPHA_ONLY | 3 | The CAPTCHA contains alphabetic characters only. |
JobRunState
The various run states a job may have.
Name | Number | Description |
CANCELLED | 1 | The job was cancelled. |
COMPLETED | 2 | The job ran to completion. |
PAUSED | 3 | The job is paused but may be resumed in the future. |
PENDING | 4 | The job has been created but has not started running yet. |
RUNNING | 5 | The job is currently running. |
DELETED | 6 | This isn't actually a valid run state. This state is sent to job status subscriptions when a job is deleted so that subscribers know that the job is gone. |
PatternMatch
Specifies how a regex pattern should be used.
Name | Number | Description |
MATCHES | 1 | |
DOES_NOT_MATCH | 2 | |
PolicyRobotsTxt.Usage
Name | Number | Description |
OBEY | 1 | Obey robots.txt rules. |
INVERT | 2 | Do the opposite of robots.txt rules. |
IGNORE | 3 | Ignore robots.txt. |
PolicyUrlRule.Action
Name | Number | Description |
ADD | 1 | Add `amount` to parent cost. |
MULTIPLY | 2 | Multiply parent cost by `amount`. |
ScheduleTimeUnit
Define time units that may be used for scheduling a job.
Name | Number | Description |
MINUTES | 1 | |
HOURS | 2 | |
DAYS | 3 | |
WEEKS | 4 | |
MONTHS | 5 | |
YEARS | 6 | |
ScheduleTiming
Define when a job should be scheduled.
Name | Number | Description |
AFTER_PREVIOUS_JOB_FINISHED | 1 | Schedule a job X units of time after the previous job completes. |
REGULAR_INTERVAL | 2 | Schedule a job every X units of time; if the previous job is still running, end it. |
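The difference between the two timing modes can be sketched with a small calculation. This assumes that REGULAR_INTERVAL is measured from the previous job's start time, which the table above does not state explicitly. The function name is hypothetical, and the interval is passed as a `timedelta` rather than derived from `time_unit` and `num_units`.

```python
from datetime import datetime, timedelta

AFTER_PREVIOUS_JOB_FINISHED, REGULAR_INTERVAL = 1, 2  # ScheduleTiming values


def next_start(
    timing: int,
    interval: timedelta,
    previous_started: datetime,
    previous_completed: datetime,
) -> datetime:
    """Compute when the next scheduled job would begin."""
    if timing == REGULAR_INTERVAL:
        # Assumption: the interval is anchored to the previous start.
        return previous_started + interval
    return previous_completed + interval
```

With an hourly schedule and a job that ran from 00:00 to 00:30, REGULAR_INTERVAL gives 01:00 while AFTER_PREVIOUS_JOB_FINISHED gives 01:30.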
SubscriptionClosed.Reason
Name | Number | Description |
COMPLETE | 1 | |
ERROR | 2 | |
Scalar Value Types
.proto Type | Notes | C++ Type | Java Type | Python Type |
double | | double | double | float |
float | | float | float | float |
int32 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint32 instead. | int32 | int | int |
int64 | Uses variable-length encoding. Inefficient for encoding negative numbers – if your field is likely to have negative values, use sint64 instead. | int64 | long | int/long |
uint32 | Uses variable-length encoding. | uint32 | int | int/long |
uint64 | Uses variable-length encoding. | uint64 | long | int/long |
sint32 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int32s. | int32 | int | int |
sint64 | Uses variable-length encoding. Signed int value. These more efficiently encode negative numbers than regular int64s. | int64 | long | int/long |
fixed32 | Always four bytes. More efficient than uint32 if values are often greater than 2^28. | uint32 | int | int |
fixed64 | Always eight bytes. More efficient than uint64 if values are often greater than 2^56. | uint64 | long | int/long |
sfixed32 | Always four bytes. | int32 | int | int |
sfixed64 | Always eight bytes. | int64 | long | int/long |
bool | | bool | boolean | boolean |
string | A string must always contain UTF-8 encoded or 7-bit ASCII text. | string | String | str/unicode |
bytes | May contain any arbitrary sequence of bytes. | string | ByteString | str |
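The note about int32 being inefficient for negative numbers follows from how varints work: a negative int32 is sign-extended to 64 bits before encoding, while sint32 first applies ZigZag encoding. The sketch below computes encoded lengths only. It is not a full encoder, and the function names are chosen for illustration.

```python
def varint_length(value: int) -> int:
    """Number of bytes needed to varint-encode a non-negative integer."""
    length = 1
    while value >= 0x80:
        value >>= 7  # each varint byte carries 7 bits of payload
        length += 1
    return length


def int32_length(value: int) -> int:
    """Encoded size of an int32 field value (sign-extended to 64 bits)."""
    return varint_length(value & 0xFFFFFFFFFFFFFFFF)


def zigzag32(value: int) -> int:
    """Map signed 32-bit integers to unsigned: 0, -1, 1, -2 -> 0, 1, 2, 3."""
    return ((value << 1) ^ (value >> 31)) & 0xFFFFFFFF


def sint32_length(value: int) -> int:
    """Encoded size of a sint32 field value."""
    return varint_length(zigzag32(value))
```

Under this scheme, -1 occupies ten bytes as an int32 but a single byte as a sint32. This is worth keeping in mind for the fields in this API that default to -1.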