========
Overview
========
.. contents::
   :local:
What is Antenna?
================
Antenna is the name of the collector for the Mozilla crash ingestion pipeline.
The processor, scheduled task runner, and webapp portions of the crash
ingestion pipeline are in Socorro.
For more information about the crash ingestion pipeline and what the collector
does, see the Socorro Overview.
Purpose
=======
Antenna is the collector of the crash ingestion pipeline. It handles incoming
crash reports posted by crash reporter clients, generates a crash id that it
returns to the client, saves the crash report data, and publishes the crash id
for processing.
Requirements
------------
Antenna is built with the following requirements:
1. **Minimal dependencies**

   Every dependency we add is another piece of software whose release cycle we
   have to track, and whose changes can force us to update our code.

2. **Make setting it up straightforward**

   Antenna should be straightforward to set up. Minimal configuration options.
   Good defaults. Good documentation.

3. **Easy to test**

   Antenna should be built in such a way that it's easy to write tests for.
   Tests that are easy to read and easy to write are easy to verify, which
   makes it more likely that the software is high quality.
High-level architecture
=======================
Antenna is the collector of the crash ingestion pipeline.
.. image:: drawio/antenna_architecture.drawio.svg
Data flow
=========
This is the rough data flow:
1. Crash reporter client submits a crash report via HTTP POST with a
   ``multipart/form-data`` encoded payload.

   See "Specification: Submitting Crash Reports" for details on format.

2. Antenna's ``BreakpadSubmitterResource`` handles the HTTP POST
   request.

   If the payload is compressed, Antenna uncompresses it.

   Antenna extracts the payload.

   Antenna throttles the crash report using a ruleset defined in the throttler.

   If the throttler rejects the crash, collection ends here.

   If the throttler accepts the crash, Antenna generates a crash id.

3. Then ``BreakpadSubmitterResource`` passes the data to the crashmover
   to save and publish.

   If crashstorage is ``GcsCrashStorage``, then the crashmover saves the crash
   report data to Google Cloud Storage.

   If the save is successful, then the crashmover publishes the crash report
   id to the Google Cloud Pub/Sub standard queue topic for processing.

   At this point, the HTTP POST has been handled, the crash id is sent to the
   crash reporter client and the HTTP connection ends.
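
The steps above can be sketched roughly as follows. This is a minimal
illustration, not Antenna's actual code: ``handle_post``, ``throttle``,
``FakeStorage``, and ``FakePublisher`` are hypothetical names, gzip is assumed
as the compression format, the throttle rule is made up, and a plain UUID
stands in for however Antenna really generates crash ids. Multipart parsing is
left to a caller-supplied ``parse`` function.

.. code-block:: python

   import gzip
   import uuid

   ACCEPT, REJECT = "accept", "reject"


   def throttle(annotations):
       """Hypothetical ruleset: reject reports that lack a ProductName."""
       return ACCEPT if annotations.get("ProductName") else REJECT


   class FakeStorage:
       """In-memory stand-in for a crashstorage such as GcsCrashStorage."""

       def __init__(self):
           self.saved = {}

       def save(self, crash_id, annotations, dumps):
           self.saved[crash_id] = (annotations, dumps)
           return True


   class FakePublisher:
       """In-memory stand-in for publishing to the standard queue topic."""

       def __init__(self):
           self.published = []

       def publish(self, crash_id):
           self.published.append(crash_id)


   def handle_post(body, compressed, parse, storage, publisher):
       """Return a crash id, or None if the throttler rejects the report."""
       if compressed:
           # Assumption: compressed payloads are gzip-encoded.
           body = gzip.decompress(body)
       annotations, dumps = parse(body)
       if throttle(annotations) == REJECT:
           return None  # collection ends here
       # Assumption: a random UUID stands in for the real crash id scheme.
       crash_id = str(uuid.uuid4())
       # Publish for processing only if the save succeeded.
       if storage.save(crash_id, annotations, dumps):
           publisher.publish(crash_id)
       return crash_id

The ordering is the point of the sketch: throttling happens before a crash id
exists, and publishing happens only after a successful save.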
Diagnostics
===========
Collector-added fields
----------------------
Antenna adds several fields to the raw crash capturing information about
collection:
``metadata``
Holds additional properties of the crash report including how it was
structured and whether there were any problems with it.
``collector_notes``
Notes covering what happened during collection. This includes which fields
were removed from the raw crash.
``dump_checksums``
Map of dump name (e.g. ``upload_file_minidump``) to md5 checksum for that
dump.
``payload``
Specifies how the crash annotations were encoded in the crash report.
``multipart`` means the crash annotations were encoded in
``multipart/form-data`` fields and ``json`` means the crash annotations were
in a JSON-encoded value in a field named ``extra``.
``payload_compressed``
``1`` if the payload was compressed and ``0`` if it wasn't.
``submitted_timestamp``
The timestamp for when this crash report was collected in UTC in
``YYYY-MM-DDTHH:MM:SS.SSSSSS`` format.
``uuid``
The crash id generated for this crash report.
``version``
The raw crash schema version. Currently, this is 2.
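
As an illustration of how a few of these values could be derived, here is a
minimal sketch. The helper names, the sample dump bytes, and the layout of the
resulting dict are assumptions for illustration; only the md5 checksums and the
timestamp format come from the descriptions above.

.. code-block:: python

   import hashlib
   from datetime import datetime, timezone


   def dump_checksums(dumps):
       """Map each dump name to the md5 hex digest of its bytes."""
       return {name: hashlib.md5(data).hexdigest() for name, data in dumps.items()}


   def submitted_timestamp(now=None):
       """Format a UTC datetime as YYYY-MM-DDTHH:MM:SS.SSSSSS."""
       now = now or datetime.now(timezone.utc)
       return now.strftime("%Y-%m-%dT%H:%M:%S.%f")


   # Made-up dump contents; a real dump is a binary minidump file.
   dumps = {"upload_file_minidump": b"abcd1234"}

   # Illustrative layout only; how these fields are arranged inside the
   # raw crash is not shown here.
   fields = {
       "dump_checksums": dump_checksums(dumps),
       "payload": "multipart",
       "submitted_timestamp": submitted_timestamp(),
       "version": 2,
   }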
Logs to stdout
--------------
In a production environment, Antenna logs to stdout in mozlog format.
You can see crashes being accepted and saved::
    {"Timestamp": 1493998643710555648, "Type": "antenna.breakpad_resource", "Logger": "antenna", "Hostname": "ebf44d051438", "EnvVersion": "2.0", "Severity": 6, "Pid": 15, "Fields": {"host_id": "ebf44d051438", "message": "8e01b4e0-f38f-4b16-bc5a-043971170505: matched by is_firefox_desktop; returned DEFER"}}
    {"Timestamp": 1493998645733482752, "Type": "antenna.breakpad_resource", "Logger": "antenna", "Hostname": "ebf44d051438", "EnvVersion": "2.0", "Severity": 6, "Pid": 15, "Fields": {"host_id": "ebf44d051438", "message": "8e01b4e0-f38f-4b16-bc5a-043971170505 saved"}}
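
Because each log line is a JSON object, the lines are easy to inspect with a
short script. The sketch below parses a shortened copy of the second line
above. ``parse_log_line`` is a hypothetical helper, and it assumes, based on
the magnitude of the value, that ``Timestamp`` is in nanoseconds since the Unix
epoch.

.. code-block:: python

   import json
   from datetime import datetime, timezone


   def parse_log_line(line):
       """Return (UTC datetime, logger type, message) for one log line."""
       record = json.loads(line)
       # Assumption: Timestamp is nanoseconds since the Unix epoch.
       # Integer division avoids float rounding on such a large number.
       seconds = record["Timestamp"] // 10**9
       when = datetime.fromtimestamp(seconds, tz=timezone.utc)
       return when, record["Type"], record["Fields"]["message"]


   line = (
       '{"Timestamp": 1493998645733482752, '
       '"Type": "antenna.breakpad_resource", "Severity": 6, '
       '"Fields": {"message": "8e01b4e0-f38f-4b16-bc5a-043971170505 saved"}}'
   )
   when, log_type, message = parse_log_line(line)
   print(when.isoformat(), log_type, message)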
Statsd
------
Antenna sends data to statsd. Read the code to see which metrics are
available, where they are emitted, and what they mean.
Here are some good ones:
* ``breakpad_resource.incoming_crash``

  Counter. Denotes an incoming crash.

* ``throttle.*``

  Counters. Throttle results. Possibilities: ``accept``, ``defer``, ``reject``.

* ``breakpad_resource.save_crash.count``

  Counter. Denotes a crash has been successfully saved.

* ``breakpad_resource.save_queue_size``

  Gauge. Tells you how many crash reports are sitting in the
  ``crashmover_save_queue``.

  .. Note::

     If this number is > 0, it means that Antenna is having difficulties keeping
     up with incoming crashes.

* ``breakpad_resource.on_post.time``

  Timing. This is the time it took to handle the HTTP POST request.

* ``breakpad_resource.crash_save.time``

  Timing. This is the time it took to save the crash to Google Cloud Storage.

* ``breakpad_resource.crash_handling.time``

  Timing. This is the total time the crash was in Antenna-land from receiving
  the crash to saving it to Google Cloud Storage.
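
For reference when reading these metrics on the wire, the sketch below formats
a counter, a gauge, and a timing in the plain statsd line protocol
(``name:value|type``). It does not show how Antenna emits metrics:
``format_metric`` is a hypothetical helper and the values are made up.

.. code-block:: python

   # statsd type suffixes: counter, gauge, and timing (in milliseconds).
   TYPE_CODES = {"counter": "c", "gauge": "g", "timing": "ms"}


   def format_metric(name, value, kind):
       """Format one metric as a statsd protocol line."""
       return f"{name}:{value}|{TYPE_CODES[kind]}"


   lines = [
       format_metric("breakpad_resource.incoming_crash", 1, "counter"),
       format_metric("breakpad_resource.save_queue_size", 0, "gauge"),
       format_metric("breakpad_resource.on_post.time", 12.5, "timing"),
   ]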
Sentry
------
Antenna works with Sentry and will send unhandled startup errors and other
unhandled errors to Sentry, where you can more easily see what's going on. You
can use the hosted Sentry or run your own Sentry instance; either will work
fine.
Cloud storage file hierarchy
----------------------------
If you use the Google Cloud Storage crashstorage component, then crashes get
saved in this hierarchy in the bucket:
* ``/v1/raw_crash/<DATE>/<CRASHID>``
* ``/v1/dump_names/<CRASHID>``

And then one or more dumps in directories by dump name:

* ``/v1/<DUMP_NAME>/<CRASHID>``

Note that ``upload_file_minidump`` gets converted to ``dump``.
For example, a crash with id ``00007bd0-2d1c-4865-af09-80bc00170413`` and
two dumps "upload_file_minidump" and "upload_file_minidump_flash1" gets
these files saved::
    v1/raw_crash/20170413/00007bd0-2d1c-4865-af09-80bc00170413

        Raw crash serialized in JSON.

    v1/dump_names/00007bd0-2d1c-4865-af09-80bc00170413

        Map of dump_name to file name serialized in JSON.

    v1/dump/00007bd0-2d1c-4865-af09-80bc00170413

        upload_file_minidump dump.

    v1/upload_file_minidump_flash1/00007bd0-2d1c-4865-af09-80bc00170413

        upload_file_minidump_flash1 dump.
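
The naming rules above can be sketched as a small function. ``storage_keys`` is
a hypothetical helper, not part of Antenna. It assumes the date directory can
be derived from the last six characters of the crash id (``170413`` becoming
``20170413``), which matches the example but is not stated explicitly here.

.. code-block:: python

   def storage_keys(crash_id, dump_names):
       """Return the object keys saved for one crash report."""
       # Assumption: the crash id ends in YYMMDD and the century is 20.
       date = "20" + crash_id[-6:]
       keys = [
           f"v1/raw_crash/{date}/{crash_id}",
           f"v1/dump_names/{crash_id}",
       ]
       for name in dump_names:
           # upload_file_minidump is stored under "dump"; other dumps keep
           # their own name as the directory.
           directory = "dump" if name == "upload_file_minidump" else name
           keys.append(f"v1/{directory}/{crash_id}")
       return keys


   keys = storage_keys(
       "00007bd0-2d1c-4865-af09-80bc00170413",
       ["upload_file_minidump", "upload_file_minidump_flash1"],
   )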