.. highlight:: rst
.. _scmlpick:
########
scmlpick
########
.. scmlpick
.. =============
``scmlpick`` is a SeisComP module that performs real-time seismic phase picking
using machine-learning models (EQCCT by default), designed for operational networks
and research workflows. It supports multi-pipeline execution, per-station configuration
via SeisComP **bindings**, and **playback** for offline reprocessing.
Seamless integration with `SeisComP `_ (`official documentation `_)
enables direct publishing of :term:`Pick` objects to the messaging system, use of station
bindings, and interoperability with other SeisComP modules.
Installation
=============
SeisComP System
^^^^^^^^^^^^^^^
- **Required**: `SeisComP `_
- **Minimum version**: Fully compatible with SeisComP releases ≥ ``4.0.0``
- **Recommended**: ``6.0.0`` or higher for improved stability and performance
Follow the official installation instructions:
`https://docs.seiscomp.de/ `_
Once installed, verify that SeisComP is properly configured and accessible in your environment.
Operating System
^^^^^^^^^^^^^^^^
- **Tested on**: ``Linux``
- **Validated on**: ``Ubuntu 22.04 LTS``
Programming Language
^^^^^^^^^^^^^^^^^^^^
- **Python ≥ 3.10**
.. note::
Python versions prior to ``3.10`` are **not supported** due to incompatibilities
with the ``ray`` library.
Required Python Packages
^^^^^^^^^^^^^^^^^^^^^^^^
It is strongly recommended to install these packages inside a dedicated virtual
environment (e.g., Conda) to avoid dependency conflicts.
.. code-block:: bash
conda create -n scmlpick python=3.10
conda activate scmlpick
Install the following Python packages:
.. code-block:: bash
pip install ray
pip install numpy==1.26.4 # Avoid numpy ≥ 2.0 to maintain compatibility
pip install pandas
pip install obspy
pip install tensorflow
pip install silence_tensorflow
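
A quick way to confirm that the environment meets the requirements above is a small check script. The following is a minimal sketch and not part of ``scmlpick``: the helper names (``numpy_is_compatible``, ``check_environment``) are hypothetical, and it relies only on the standard library's ``importlib.metadata`` to read the installed versions.

.. code-block:: python

    import sys
    from importlib import metadata

    MIN_PYTHON = (3, 10)   # minimum Python version stated above
    NUMPY_MAX_MAJOR = 2    # numpy >= 2.0 should be avoided
    REQUIRED = ("ray", "numpy", "pandas", "obspy",
                "tensorflow", "silence_tensorflow")


    def numpy_is_compatible(version: str) -> bool:
        """Return True if the numpy major version is below 2."""
        return int(version.split(".")[0]) < NUMPY_MAX_MAJOR


    def check_environment() -> list:
        """Collect human-readable problems; an empty list means all checks passed."""
        problems = []
        if sys.version_info[:2] < MIN_PYTHON:
            problems.append("Python >= 3.10 is required")
        for dist in REQUIRED:
            try:
                version = metadata.version(dist)
            except metadata.PackageNotFoundError:
                problems.append(f"{dist} is not installed")
                continue
            if dist == "numpy" and not numpy_is_compatible(version):
                problems.append(f"numpy {version} found; use a release below 2.0")
        return problems


    if __name__ == "__main__":
        for problem in check_environment():
            print("WARNING:", problem)

Running the script inside the activated environment prints one warning per unmet requirement and nothing when every check passes.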
Clone the Repository
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Clone the project repository into any local directory:
.. code-block:: bash
git clone https://github.com/ut-beg-texnet/scmlpick.git
Deploy the Code into the SeisComP Installation Directory
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Copy all necessary files into your SeisComP installation using ``rsync`` to preserve
the directory structure:
.. code-block:: bash
rsync -av /path/to/your-cloned-repository/seiscomp/ /path/to/your-seiscomp-installation/
- Replace ``/path/to/your-cloned-repository/seiscomp/`` with the absolute path to the seiscomp folder in your cloned repository.
- Replace ``/path/to/your-seiscomp-installation/`` with your SeisComP root (typically ``$SEISCOMP_ROOT/``).
.. note::
This step integrates the module into the SeisComP environment, preserving file structure
and permissions.
Predictor Installation
^^^^^^^^^^^^^^^^^^^^^^
The ``scmlpick-predicctor`` module must be installed **after** the scmlpick repository has been deployed into the SeisComP path.
The installation procedure depends on whether a virtual environment is being used.
**If NOT using a virtual environment**
.. code-block:: bash
cd $SEISCOMP_ROOT/share/scmlpick/tools/scmlpick-predicctor
pip3 install -e .
**If using a Conda or other virtual environment**
Activate your environment (replace ``scmlpick`` with your actual environment name):
.. code-block:: bash
conda activate scmlpick
Navigate to the predictor directory:
.. code-block:: bash
cd $SEISCOMP_ROOT/share/scmlpick/tools/scmlpick-predicctor
Install the predictor module:
.. code-block:: bash
pip install -e .
.. note::
Once installed, verify that SeisComP is properly configured and accessible in your environment.
Getting Started
===============
Real-Time Setup
^^^^^^^^^^^^^^^
Before running ``scmlpick`` in real time, complete the following steps:
1. Create a profile in the pipelines section of the module configuration (see :confval:`pipelines.profiles`).
2. Create and configure the pick target and location target groups in the profile (see :confval:`pipelines.profiles.$name.pickTargetGroup` and :confval:`pipelines.profiles.$name.locTargetGroup`).
3. Create a **binding profile** with the pick-enable option set (see :ref:`bindings`); its name must match the module profile created in step 1.
4. Assign the binding profile to stations/streams in `scconfig `_ (or edit the bindings files).
5. Start your messaging: ``scmaster start`` (or your standard SeisComP workflow).
6. Launch ``scmlpick``:
.. code-block:: bash
scmlpick -u user --debug
To list the available command-line options, call the module with ``-h``:
.. code-block:: bash
seiscomp exec scmlpick -h
Next Steps
^^^^^^^^^^
* Create **module profiles** (see :ref:`profiles`) to separate data flows and define processing tailored to specific purposes.
* Depending on the number of stations and the resources available, configure the Ray parameters (see :ref:`permormance`).
* Tune critical parameters in the module configuration, such as the P and S models, probability thresholds, and window lengths and overlaps.
* Use playback mode (see :ref:`playback`) with miniSEED files or historical data to post-process data or validate parameters.
.. _profiles:
Module Profiles
===============
In **scmlpick**, in contrast to other SeisComP modules, creating aliases is **not** recommended.
Each alias starts a separate instance that runs in parallel and increases CPU/RAM usage.
Instead, run a single instance that processes all stations and use profiles (see :confval:`pipelines.profiles`)
to define different workflows. Profiles let you run multiple independent pipelines
within one process, avoiding the need for ``scmlpick`` aliases.
Each profile can define its own:
* :confval:`pipelines.profiles.$name.pickTargetGroup`: all picks emitted by this profile are published to that specific messaging group.
* :confval:`pipelines.profiles.$name.locTargetGroup`: this profile subscribes to locations from that messaging group.
* :confval:`pipelines.profiles.$name.authorTarget`: locations carrying this author are checked against the profile’s quality-control parameters.
* :confval:`pipelines.profiles.$name.region`: polygon used to evaluate (filter/validate) locations processed by this profile.
Excerpt of ``scmlpick.cfg`` including two profiles:
.. code-block:: ini
# List of profile names.
pipelines.profiles = PROF1, PROF2
# This is the name of the group to send pick messages for specific profile
pipelines.profiles.PROF1.pickTargetGroup = PPROF1
# This is the name of the group to send location messages for specific profile
pipelines.profiles.PROF1.locTargetGroup = LPROF1
# This author should match the origin (location) author for locations to be
# checked against quality control parameters.
pipelines.profiles.PROF1.authorTarget = LOCSATPROF1
# The BNA file defining regions (closed polygons) or tuple with 4 elements.
# Inside this region, the origin evaluation status is TrueOrgEvalStat, and
# outside is FalseOrgEvalStat. If this is set to none, then the origin location
# will not be checked against any polygon.
pipelines.profiles.PROF1.region = @DATADIR@/scmlpick/bna/PROF1.bna
# This is the name of the group to send pick messages for specific profile
pipelines.profiles.PROF2.pickTargetGroup = PPROF2
# This is the name of the group to send location messages for specific profile
pipelines.profiles.PROF2.locTargetGroup = LPROF2
# This author should match the origin (location) author for locations to be
# checked against quality control parameters.
pipelines.profiles.PROF2.authorTarget = LOCSATPROF2
# The BNA file defining regions (closed polygons) or tuple with 4 elements.
# Inside this region, the origin evaluation status is TrueOrgEvalStat, and
# outside is FalseOrgEvalStat. If this is set to none, then the origin location
# will not be checked against any polygon.
pipelines.profiles.PROF2.region = @DATADIR@/scmlpick/bna/PROF2.bna
.. note::
Each **module profile** must have a corresponding **binding profile** that
associates the profile with specific stations; see :ref:`bindings`.
The **module profile name must match the binding profile name**.
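
The name-matching rule can be checked before starting the module. The sketch below is illustrative only: it assumes plain ``key = value`` lines (the full SeisComP configuration syntax is richer), the function names are hypothetical, and the binding fragment is an invented example built from the ``profiles`` binding parameters documented further down.

.. code-block:: python

    def parse_cfg(text: str) -> dict:
        """Parse 'key = value' lines, skipping blank lines and '#' comments."""
        params = {}
        for line in text.splitlines():
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, value = line.split("=", 1)
            params[key.strip()] = value.strip()
        return params


    def split_list(value: str) -> list:
        """Split a comma-separated configuration value into clean items."""
        return [item.strip() for item in value.split(",") if item.strip()]


    def unmatched_profiles(module_cfg: str, binding_cfg: str) -> tuple:
        """Return (names missing in bindings, names missing in module config)."""
        module_names = set(split_list(parse_cfg(module_cfg).get("pipelines.profiles", "")))
        binding_names = set(split_list(parse_cfg(binding_cfg).get("profiles", "")))
        return sorted(module_names - binding_names), sorted(binding_names - module_names)


    # Hypothetical fragments used only for illustration.
    MODULE_CFG = """
    pipelines.profiles = PROF1, PROF2
    pipelines.profiles.PROF1.pickTargetGroup = PPROF1
    """
    BINDING_CFG = """
    profiles = PROF1
    profiles.PROF1.pickEnable = true
    """

    missing_in_bindings, missing_in_module = unmatched_profiles(MODULE_CFG, BINDING_CFG)
    print(missing_in_bindings)  # ['PROF2']
    print(missing_in_module)    # []

Here ``PROF2`` is declared in the module configuration but has no binding profile, so no station would be processed by it.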
.. _bindings:
Bindings
========
In SeisComP, *bindings* attach module-specific parameters to stations. ``scmlpick``
uses bindings to activate specific stations to be processed by an ``scmlpick`` profile.
Where to Configure
^^^^^^^^^^^^^^^^^^
* In `scconfig `_ assign the binding to the desired stations (bindings tab).
* Or edit files under ``$SEISCOMP_ROOT/etc/bindings/scmlpick/``
Each binding enables picking for the stations **bound** to it:
* :confval:`profiles.$name.pickEnable`: Boolean flag to enable picking in this binding.
You can also set a station-specific **filter**. This filter is applied at the
beginning of the processing chain, before the ML algorithm runs:
* :confval:`profiles.$name.filter` Parameter following the standard SeisComP filter grammar (`reference `_).
.. _playback:
Playback
========
``scmlpick`` supports **playback** for offline (non–real-time) processing. This mode is
useful for validation, benchmarking, parameter tuning, and research.
Data sources
^^^^^^^^^^^^
Playback can run against:
* **Pre-downloaded miniSEED files**: fastest and fully repeatable.
* **Historical data via the configured SeisComP RecordStream** (e.g., FDSN/CAPS/SeedLink).
.. note::
Querying large time windows from a RecordStream can be slow and will re-fetch data
on every run. Prefer RecordStream playback for **short** time intervals or **one-off**
executions. For repeated experiments, use local miniSEED files.
Requirements
^^^^^^^^^^^^
Playback uses the **same configuration** as real-time mode:
* A **module profile** (see :ref:`profiles`).
* **Bindings** whose profile name **matches** the module profile (see :ref:`bindings`).
* **Stations** assigned to that binding profile.
* A valid data source: miniSEED files available locally **or** a correctly configured
RecordStream endpoint.
.. note::
Keep :confval:`eqcct.windowLength`/:confval:`eqcct.timeShift` and other configuration parameters (see :ref:`Tuning`)
consistent with your real-time setup if you want comparable latency and behavior between modes.
Command-line options
^^^^^^^^^^^^^^^^^^^^
Use the following options to run ``scmlpick`` in **playback** mode.
Input source (choose one)
-------------------------
* ``--recordstream``
Read historical data directly from a SeisComP RecordStream (e.g., FDSN/CAPS/SeedLink).
* ``--mseed-files ``
Read from one or more local miniSEED files, or a directory containing them.
Playback window & profile
-------------------------
* ``--profile ``
Select the module profile to use. The profile name must match an existing
module profile **and** its corresponding binding profile.
* ``--start "YYYY-MM-DD HH:MM:SS"``
* ``--end "YYYY-MM-DD HH:MM:SS"``
Time window for playback, applied to the chosen input source.
Output
------
* ``--output-file ``
Write results to an XML file.
* ``--database``
Send results directly to the SeisComP database.
Ray parallelism (optional)
--------------------------
* ``--cpu-number ``
Number of CPUs available to Ray. If omitted, the value from the module configuration is used.
* ``--max-tasks ``
Maximum number of concurrent tasks sent to the predictor. If omitted, the configured
default is used.
Running Playback
^^^^^^^^^^^^^^^^
*Using local miniSEED:*
.. code-block:: bash
scmlpick \
--playback \
--profile playback \
--mseed-files /data/mseed/2025-08-01/ \
--start "2025-08-01 00:00:00" \
--end "2025-08-01 06:00:00" \
--output-file /tmp/scmlpick_playback.xml \
-u user \
--debug
*Using RecordStream:*
.. code-block:: bash
scmlpick \
--playback \
--profile playback \
--recordstream \
--start "2025-08-01 00:00:00" \
--end "2025-08-01 03:00:00" \
--database \
-u user \
--debug
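
For repeated experiments it can help to assemble the playback command programmatically. The following sketch uses only the options shown above; the function name ``build_playback_command`` is hypothetical, and treating the two output options as alternatives is an assumption of this example rather than documented behavior.

.. code-block:: python

    import shlex
    from datetime import datetime

    TIME_FORMAT = "%Y-%m-%d %H:%M:%S"


    def build_playback_command(profile, start, end, *, mseed_path=None,
                               output_file=None, user=None, debug=False):
        """Return the scmlpick playback command as an argument list."""
        # Reject malformed or inverted time windows early.
        if datetime.strptime(end, TIME_FORMAT) <= datetime.strptime(start, TIME_FORMAT):
            raise ValueError("end must be later than start")
        cmd = ["scmlpick", "--playback", "--profile", profile]
        # Input source: local miniSEED if a path is given, else the RecordStream.
        if mseed_path is not None:
            cmd += ["--mseed-files", mseed_path]
        else:
            cmd.append("--recordstream")
        cmd += ["--start", start, "--end", end]
        # Output: XML file if a path is given, else the database (assumption).
        if output_file is not None:
            cmd += ["--output-file", output_file]
        else:
            cmd.append("--database")
        if user is not None:
            cmd += ["-u", user]
        if debug:
            cmd.append("--debug")
        return cmd


    cmd = build_playback_command(
        "playback", "2025-08-01 00:00:00", "2025-08-01 06:00:00",
        mseed_path="/data/mseed/2025-08-01/",
        output_file="/tmp/scmlpick_playback.xml",
        user="user", debug=True,
    )
    print(shlex.join(cmd))

The returned list can be passed directly to ``subprocess.run(cmd, check=True)``, which avoids shell quoting problems with the timestamps.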
Tuning
======
ML algorithm configuration
--------------------------
Controls which pre-trained models are used and how strict the decision rules are.
Select P/S models appropriate for your seismicity and set probability thresholds
that govern when picks are emitted; overlap and batch size affect latency and throughput.
* :confval:`eqcct.Pmodel`
* :confval:`eqcct.Smodel`
* :confval:`eqcct.eqcctPthr`
* :confval:`eqcct.eqcctSthr`
* :confval:`eqcct.eqcctOverlap`
* :confval:`eqcct.eqcctBatchSize`
Data sent for processing
------------------------
Defines the sliding-window strategy and submission timing for inference. These parameters
control window length, overlap, pre-filter warm-up, station availability requirements, and
the delays/waits that separate first/second computations and delayed-data paths.
* :confval:`eqcct.windowLength`
* :confval:`eqcct.timeShift`
* :confval:`eqcct.filterShift`
* :confval:`eqcct.minStasBulk`
* :confval:`eqcct.minDelayStasBulk`
* :confval:`eqcct.firstWait`
* :confval:`eqcct.secondWait`
* :confval:`eqcct.maxWait`
* :confval:`eqcct.startLatency`
* :confval:`eqcct.timeRemoveTrace`
* :confval:`eqcct.traceDelay`
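
To see how the window parameters interact, the sketch below lists the windows that would cover a time span. It assumes that consecutive windows start ``windowLength - timeShift`` seconds apart, which follows from the description of :confval:`eqcct.timeShift` as the overlap with the previous window; the module's real scheduling also depends on the wait and latency parameters, which are not modeled here. The function name is hypothetical.

.. code-block:: python

    def window_bounds(span_start, span_end, window_length=60,
                      time_shift=30, filter_shift=2.5):
        """Return (start, end) pairs, in seconds, of overlapping windows."""
        # timeShift must exceed filterShift, as stated for eqcct.filterShift.
        if time_shift <= filter_shift:
            raise ValueError("time_shift must be greater than filter_shift")
        if time_shift >= window_length:
            raise ValueError("time_shift must be smaller than window_length")
        step = window_length - time_shift  # distance between window starts
        windows = []
        start = span_start
        # Keep only windows that fit entirely inside the span.
        while start + window_length <= span_end:
            windows.append((start, start + window_length))
            start += step
        return windows


    print(window_bounds(0, 120))  # [(0, 60), (30, 90), (60, 120)]

With the defaults, every sample away from the span edges falls into two windows, so a larger overlap increases the inference load per second of data.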
Ray configuration
-----------------
Tunes parallelism and queueing for the predictor backend. Use these settings to balance
CPU usage and concurrency against your latency budget.
* :confval:`ray.numCPUs`
* :confval:`ray.maxTasksQueue`
Locations quality control
-------------------------
Applies simple, rule-based checks to location results. Uncertainty and geometry thresholds
determine whether an origin is assigned the “true” or “false” evaluation status; both
statuses are configurable to match your operational workflow.
* :confval:`qcheck.latUncTHR`
* :confval:`qcheck.lonUncTHR`
* :confval:`qcheck.depthUncTHR`
* :confval:`qcheck.depthTHR`
* :confval:`qcheck.azGapTHR`
* :confval:`qcheck.TrueOrgEvalStat`
* :confval:`qcheck.FalseOrgEvalStat`
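
The rule-based check can be pictured as a single pass/fail decision. The sketch below is an illustration rather than the module's implementation: the class and function names are hypothetical, the defaults mirror the documented ``qcheck`` defaults, and treating a value equal to its threshold as passing is an assumption. The region polygon test is not included.

.. code-block:: python

    from dataclasses import dataclass

    # Integer codes as listed for qcheck.TrueOrgEvalStat / FalseOrgEvalStat.
    EVALUATION_STATUS = {0: "preliminary", 1: "confirmed", 2: "reviewed",
                         3: "final", 4: "rejected", 5: "reported"}


    @dataclass
    class QCThresholds:
        lat_unc: float = 20
        lon_unc: float = 20
        depth_unc: float = 20
        depth: float = 20
        az_gap: float = 270
        true_status: int = 5
        false_status: int = 4


    def evaluation_status(lat_unc, lon_unc, depth_unc, depth, az_gap, thr=None):
        """Return the status code for an origin given its quality metrics."""
        thr = thr or QCThresholds()
        # Any value above its threshold fails the whole check.
        passed = (lat_unc <= thr.lat_unc and lon_unc <= thr.lon_unc
                  and depth_unc <= thr.depth_unc and depth <= thr.depth
                  and az_gap <= thr.az_gap)
        return thr.true_status if passed else thr.false_status


    good = evaluation_status(5, 5, 5, 10, 120)
    bad = evaluation_status(5, 5, 5, 10, 300)  # azimuthal gap too large
    print(EVALUATION_STATUS[good], EVALUATION_STATUS[bad])  # reported rejected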
.. toctree::
:maxdepth: 1
:caption: ML methods
apps/ml_methods
ML methods
==========
.. toctree::
:maxdepth: 1
:caption: Resources Optimization
apps/resources_optimization
Resources Optimization
======================
References
==========
You can cite in-text using footnote-style citations, for example:
“As shown by Ross et al. [ross2018]_ …”
.. [ross2018] Ross, Z.E., et al. (2018). Generalized Seismic Phase Detection with Deep Learning. *Bulletin of the Seismological Society of America*, 108(5), 2894–2901. https://doi.org/10.1785/0120180080
.. [allen1982] Allen, R.V. (1982). Automatic phase pickers: Their present use and future prospects. *Bulletin of the Seismological Society of America*, 72(6), S225–S242.
Configuration
=============
| :file:`etc/defaults/global.cfg`
| :file:`etc/defaults/scmlpick.cfg`
| :file:`etc/global.cfg`
| :file:`etc/scmlpick.cfg`
| :file:`~/.seiscomp/global.cfg`
| :file:`~/.seiscomp/scmlpick.cfg`
scmlpick inherits `global options `_.
.. confval:: whiteChanns
Type: *list:string*
Define the list of channel types that are allowed to be processed. Example: HH, CH
.. confval:: eqcct.Pmodel
Type: *string*
EQCCT model to detect P phases.
.. confval:: eqcct.Smodel
Type: *string*
EQCCT model to detect S phases.
.. confval:: eqcct.windowLength
Type: *int*
Define the data window length in seconds to send to eqcct.
Default is ``60``.
.. confval:: eqcct.filterShift
Type: *double*
Define the time shift to initialize the filter. This data will not be processed, so timeShift must be greater than this. \(in seconds\)
Default is ``2.5``.
.. confval:: eqcct.probThreshold
Type: *double*
Minimum probability to reach to send a new pick.
Default is ``0.001``.
.. confval:: eqcct.timeShift
Type: *int*
Define the time in seconds to overlap with the previous window.
Default is ``30``.
.. confval:: eqcct.minStasBulk
Type: *int*
Minimum number of stations with windowLength data available to send to EQCCT after firstWait.
This number should be less than the total stations.
At least this number of real\-time stations must be available to produce picks.
Default is ``5``.
.. confval:: eqcct.minDelayStasBulk
Type: *int*
Minimum number of delayed stations with windowLength data available to send to EQCCT after firstWait.
This number should be less than the total stations.
Default is ``5``.
.. confval:: eqcct.firstWait
Type: *int*
Minimum delay between now and the end time of the last minute saved in memory to compute all data available. First computation.
Default is ``5``.
.. confval:: eqcct.secondWait
Type: *int*
Minimum time to wait after first computation to proceed with Second computation.
Default is ``3``.
.. confval:: eqcct.maxWait
Type: *int*
Maximum time to wait after first computation to proceed with second computation.
Default is ``10``.
.. confval:: eqcct.startLatency
Type: *int*
Define time in seconds to wait before sending the first data packet to eqcct.
Default is ``60``.
.. confval:: eqcct.timeRemoveTrace
Type: *int*
Maximum time in seconds to wait before removing traces from the processing stream.
Default is ``600``.
.. confval:: eqcct.traceDelay
Type: *int*
Minimum delay in seconds between the current time and the start time of data acquired to send data to delayed data processing.
Default is ``120``.
.. confval:: eqcct.eqcctPthr
Type: *double*
Minimum probability to reach to send a new P pick.
Default is ``0.001``.
.. confval:: eqcct.eqcctSthr
Type: *double*
Minimum probability to reach to send a new S pick.
Default is ``0.002``.
.. confval:: eqcct.eqcctOverlap
Type: *int*
If set, detection and picking are performed in overlapping windows.
Default is ``0``.
.. confval:: eqcct.eqcctBatchSize
Type: *int*
Batch size. This won't affect the speed much but can affect the performance. A value between 200 and 1000 is recommended.
Default is ``1``.
.. confval:: eqcct.gpuID
Type: *int*
ID of the GPU used for prediction. If using the CPU, set to None.
Default is ``0``.
.. confval:: eqcct.gpuLimit
Type: *int*
Set the maximum percentage of memory usage for the GPU.
Default is ``1``.
.. confval:: pipelines.profiles
Type: *list:String*
Profile names must match the profiles in the bindings configuration.
.. note::
**pipelines.profiles.\***
*Profiles including specific configurations.*
.. note::
**pipelines.profiles.$name.\***
$name is a placeholder for the name to be used and needs to be added to :confval:`pipelines.profiles` to become active.
.. code-block:: sh
pipelines.profiles = a,b
pipelines.profiles.a.value1 = ...
pipelines.profiles.b.value1 = ...
# c is not active because it has not been added
# to the list of pipelines.profiles
pipelines.profiles.c.value1 = ...
.. confval:: pipelines.profiles.$name.pickTargetGroup
Type: *String*
This is the name of the group to send pick messages for specific profile
Default is ``PICK``.
.. confval:: pipelines.profiles.$name.locTargetGroup
Type: *String*
This is the name of the group to send location messages for specific profile
Default is ``LOCATION``.
.. confval:: pipelines.profiles.$name.authorTarget
Type: *String*
This author should match the origin \(location\) author for locations to be checked against quality control parameters.
Default is ``SCMLPICK``.
.. confval:: pipelines.profiles.$name.region
Type: *path*
The BNA file defining regions \(closed polygons\) or tuple with 4 elements. Inside this region, the origin evaluation status is TrueOrgEvalStat, and outside is FalseOrgEvalStat. If this is set to none, then the origin location will not be checked against any polygon.
Default is ``none``.
.. confval:: ray.numCPUs
Type: *int*
Number of CPUs allocated for running the run_picker\(\) to create the time chunks.
Default is ``5``.
.. confval:: ray.maxTasksQueue
Type: *int*
Maximum number of tasks that are sent to the predictor at the same time.
Default is ``50``.
.. confval:: qcheck.latUncTHR
Type: *int*
Set the latitude uncertainty threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat.
Default is ``20``.
.. confval:: qcheck.lonUncTHR
Type: *int*
Set the longitude uncertainty threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat.
Default is ``20``.
.. confval:: qcheck.depthUncTHR
Type: *int*
Set the depth uncertainty threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat.
Default is ``20``.
.. confval:: qcheck.depthTHR
Type: *int*
Set the depth threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat.
Default is ``20``.
.. confval:: qcheck.azGapTHR
Type: *int*
Set the azimuthal gap threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat.
Default is ``270``.
.. confval:: qcheck.TrueOrgEvalStat
Type: *int*
Set the origin evaluation status to use when the origin fulfills all quality thresholds. Use integers from 0 to 5, which map to the following statuses: 0\=preliminary, 1\=confirmed, 2\=reviewed, 3\=final, 4\=rejected, 5\=reported.
Default is ``5``.
.. confval:: qcheck.FalseOrgEvalStat
Type: *int*
Set the origin evaluation status to use when the origin does not fulfill one or more quality thresholds. Use integers from 0 to 5, which map to the following statuses: 0\=preliminary, 1\=confirmed, 2\=reviewed, 3\=final, 4\=rejected, 5\=reported.
Default is ``4``.
Bindings
========
.. confval:: profiles
Type: *list:String*
Profile names must match the profiles in the module configuration.
.. note::
**profiles.\***
*Activate the station for a specific profile. Profiles must exist in both the module configuration and the bindings.*
.. note::
**profiles.$name.\***
$name is a placeholder for the name to be used and needs to be added to :confval:`profiles` to become active.
.. code-block:: sh
profiles = a,b
profiles.a.value1 = ...
profiles.b.value1 = ...
# c is not active because it has not been added
# to the list of profiles
profiles.c.value1 = ...
.. confval:: profiles.$name.pickEnable
Type: *boolean*
Enables\/disables picking on a station.
Default is ``false``.
.. confval:: profiles.$name.filter
Type: *string*
Defines the filter to be used for picking.