scmlpick

scmlpick is a SeisComP module that performs real-time seismic phase picking using machine-learning models (EQCCT by default), designed for operational networks and research workflows. It supports multi-pipeline execution, per-station configuration via SeisComP bindings, and playback for offline reprocessing.

Integration with SeisComP (see the official documentation at https://docs.seiscomp.de/) enables direct publishing of Pick objects to the messaging system, use of station bindings, and interoperability with other SeisComP modules.

Installation

SeisComP System

  • Required: SeisComP

  • Minimum version: Fully compatible with SeisComP releases ≥ 4.0.0

  • Recommended: 6.0.0 or higher for improved stability and performance

Follow the official installation instructions at https://docs.seiscomp.de/. Once installed, verify that SeisComP is properly configured and accessible in your environment.

Operating System

  • Tested on: Linux

  • Validated on: Ubuntu 22.04 LTS

Programming Language

  • Python ≥ 3.10

Note

Python versions prior to 3.10 are not supported due to incompatibilities with the ray library.

Required Python Packages

It is strongly recommended to install these packages inside a dedicated virtual environment (e.g., Conda) to avoid dependency conflicts.

conda create -n scmlpick python=3.10
conda activate scmlpick

Install the following Python packages:

pip install ray
pip install numpy==1.26.4      # Avoid numpy ≥ 2.0 to maintain compatibility
pip install pandas
pip install obspy
pip install tensorflow
pip install silence_tensorflow
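
After installing, a quick sanity check of the environment can save debugging time later. The following is a minimal sketch, not part of scmlpick: the function name check_environment is hypothetical, and it only checks the constraints stated above (Python ≥ 3.10, the listed packages present, numpy below 2.0).

```python
import sys
from importlib import metadata

# Packages listed in the installation steps above.
REQUIRED = ["ray", "numpy", "pandas", "obspy", "tensorflow", "silence_tensorflow"]


def check_environment(required=REQUIRED):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python >= 3.10 is required")
    for name in required:
        try:
            version = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name} is not installed")
            continue
        # The install notes pin numpy to a release below 2.0.
        if name == "numpy" and int(version.split(".")[0]) >= 2:
            problems.append(f"numpy {version} found; a 1.x release is expected")
    return problems
```

Run check_environment() inside the activated environment and print the returned list; any entry points at a step above that needs to be repeated.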

Clone the Repository

Clone the project repository into any local directory:

git clone https://github.com/ut-beg-texnet/scmlpick.git

Deploy the Code into the SeisComP Installation Directory

Copy all necessary files into your SeisComP installation using rsync to preserve the directory structure:

rsync -av /path/to/your-cloned-repository/seiscomp/ /path/to/your-seiscomp-installation/

  • Replace /path/to/your-cloned-repository/seiscomp/ with the absolute path to the seiscomp folder in your cloned repository.

  • Replace /path/to/your-seiscomp-installation/ with your SeisComP root (typically $SEISCOMP_ROOT/).

Note

This step integrates the module into the SeisComP environment, preserving file structure and permissions.

Predictor Installation

The scmlpick-predicctor module must be installed after the scmlpick repository has been deployed into the SeisComP path. The installation procedure depends on whether a virtual environment is being used.

If NOT using a virtual environment

cd $SEISCOMP_ROOT/share/scmlpick/tools/scmlpick-predicctor
pip3 install -e .

If using a Conda or other virtual environment

Activate your environment (replace scmlpick with your actual environment name):

conda activate scmlpick

Navigate to the predictor directory:

cd $SEISCOMP_ROOT/share/scmlpick/tools/scmlpick-predicctor

Install the predictor module:

pip install -e .

Getting Started

Real-Time Setup

Before running scmlpick in real time, complete the following steps:

  1. Create a profile in the pipelines section of the module configuration (see pipelines.profiles).

  2. Create and configure the pick and location target groups in the profile (see pipelines.profiles.$name.pickTargetGroup and pipelines.profiles.$name.locTargetGroup).

  3. Create a binding profile with the pick-enable option set (see Bindings). Its name must match the module profile created in step 1.

  4. Assign the binding profile to stations/streams in scconfig (or edit bindings files).

  5. Start your messaging: scmaster start (or your standard SeisComP workflow).

  6. Launch scmlpick:

    scmlpick -u user --debug
    

To run it from the command line, call it with the appropriate options, e.g.:

seiscomp exec scmlpick -h

Next Steps

  • Create module profiles (see Module Profiles) to separate data flows and apply dedicated processing for specific purposes.

  • Depending on the number of stations and the resources available, configure the Ray parameters (see Performance).

  • Tune critical parameters such as the P and S models, probability thresholds, window length, and window overlap in the module configuration.

  • Use playback mode (see Playback) with miniSEED files or historical data to post-process data or validate parameters.

Module Profiles

Unlike in other SeisComP modules, creating aliases is not recommended for scmlpick. Each alias starts a separate instance that runs in parallel and increases CPU/RAM usage. Instead, run a single instance that processes all stations and define profiles (see pipelines.profiles) for the different workflows. Profiles let you run multiple independent pipelines within one process, avoiding the need for scmlpick aliases.

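Each profile can define its own:

  • Pick target group (pipelines.profiles.$name.pickTargetGroup)

  • Location target group (pipelines.profiles.$name.locTargetGroup)

  • Target author (pipelines.profiles.$name.authorTarget)

  • Region (pipelines.profiles.$name.region)

Segment of scmlpick.cfg including two profiles: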

# List with profiles names.
pipelines.profiles = PROF1, PROF2

# This is the name of the group to send pick messages for specific profile
pipelines.profiles.PROF1.pickTargetGroup = PPROF1

# This is the name of the group to send location messages for specific profile
pipelines.profiles.PROF1.locTargetGroup = LPROF1

# This author should match the origin (location) author for locations to be
# checked against quality control parameters.
pipelines.profiles.PROF1.authorTarget = LOCSATPROF1

# The BNA file defining regions (closed polygons) or tuple with 4 elements.
# Inside this region, the origin evaluation status is TrueOrgEvalStat, and
# outside is FalseOrgEvalStat. If this is set to none, then the origin location
# will not be checked against any polygon.
pipelines.profiles.PROF1.region = @DATADIR@/scmlpick/bna/PROF1.bna

# This is the name of the group to send pick messages for specific profile
pipelines.profiles.PROF2.pickTargetGroup = PPROF2

# This is the name of the group to send location messages for specific profile
pipelines.profiles.PROF2.locTargetGroup = LPROF2

# This author should match the origin (location) author for locations to be
# checked against quality control parameters.
pipelines.profiles.PROF2.authorTarget = LOCSATPROF2

# The BNA file defining regions (closed polygons) or tuple with 4 elements.
# Inside this region, the origin evaluation status is TrueOrgEvalStat, and
# outside is FalseOrgEvalStat. If this is set to none, then the origin location
# will not be checked against any polygon.
pipelines.profiles.PROF2.region = @DATADIR@/scmlpick/bna/PROF2.bna

Note

Each module profile must have a corresponding binding profile that associates the profile with specific stations; see Bindings. The module profile name must match the binding profile name.
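
Because a mismatch between module and binding profile names is easy to miss, it can help to compare the two lists before starting the module. The sketch below is illustrative only: parse_list and unmatched_profiles are hypothetical helpers, and the parser handles just simple single-line "key = a, b" assignments with # comments, not the full SeisComP configuration syntax.

```python
def parse_list(cfg_text, key):
    """Return the comma-separated values assigned to `key` in simple cfg text."""
    for raw in cfg_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if "=" not in line:
            continue
        name, value = (part.strip() for part in line.split("=", 1))
        if name == key:
            return [item.strip() for item in value.split(",") if item.strip()]
    return []


def unmatched_profiles(module_cfg, binding_cfg):
    """Return (names only in the module config, names only in the binding)."""
    module = set(parse_list(module_cfg, "pipelines.profiles"))
    binding = set(parse_list(binding_cfg, "profiles"))
    return sorted(module - binding), sorted(binding - module)


module_cfg = "pipelines.profiles = PROF1, PROF2\n"
binding_cfg = "profiles = PROF1\nprofiles.PROF1.pickEnable = true\n"
only_module, only_binding = unmatched_profiles(module_cfg, binding_cfg)
print(only_module, only_binding)
```

In this example PROF2 exists in the module configuration but has no binding profile, so it is reported in the first list.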

Bindings

In SeisComP, bindings attach module-specific parameters to stations. scmlpick uses bindings to activate specific stations to be processed by an scmlpick profile.

Where to Configure

  • In scconfig assign the binding to the desired stations (bindings tab).

  • Or edit files under $SEISCOMP_ROOT/etc/bindings/scmlpick/

Each binding enables picking for the stations bound to it (see profiles.$name.pickEnable).

You can also set a station-specific filter (see profiles.$name.filter). This filter is applied at the beginning of the processing chain, before the ML algorithm runs.

Playback

scmlpick supports playback for offline (non–real-time) processing. This mode is useful for validation, benchmarking, parameter tuning, and research.

Data sources

Playback can run against:

  • Pre-downloaded miniSEED files: fastest and fully repeatable.

  • Historical data via the configured SeisComP RecordStream (e.g., FDSN/CAPS/SeedLink).

Note

Querying large time windows from a RecordStream can be slow and will re-fetch data on every run. Prefer RecordStream playback for short time intervals or one-off executions. For repeated experiments, use local miniSEED files.

Requirements

Playback uses the same configuration as real-time mode:

  • A module profile (see Module Profiles).

  • Bindings whose profile name matches the module profile (see Bindings).

  • Stations assigned to that binding profile.

  • A valid data source: miniSEED files available locally or a correctly configured RecordStream endpoint.

Note

Keep eqcct.windowLength/eqcct.timeShift and other configuration parameters (see Tuning) consistent with your real-time setup if you want comparable latency and behavior between modes.

Command-line options

Use the following options to run scmlpick in playback mode.

Input source (choose one)

  • --recordstream Read historical data directly from a SeisComP RecordStream (e.g., FDSN/CAPS/SeedLink).

  • --mseed-files <PATH> Read from one or more local miniSEED files, or a directory containing them.

Playback window & profile

  • --profile <NAME> Select the module profile to use. The profile name must match an existing module profile and its corresponding binding profile.

  • --start "YYYY-MM-DD HH:MM:SS" Start of the playback time window.

  • --end "YYYY-MM-DD HH:MM:SS" End of the playback time window. The window is applied to the chosen input source.

Output

  • --output-file <FILE> Write results to an XML file.

  • --database Send results directly to the SeisComP database.

Ray parallelism (optional)

  • --cpu-number <N> Number of CPUs available to Ray. If omitted, the value from the module configuration is used.

  • --max-tasks <N> Maximum number of concurrent tasks sent to the predictor. If omitted, the configured default is used.

Running Playback

Using local miniSEED:

scmlpick \
  --playback \
  --profile playback \
  --mseed-files /data/mseed/2025-08-01/ \
  --start "2025-08-01 00:00:00" \
  --end   "2025-08-01 06:00:00" \
  --output-file /tmp/scmlpick_playback.xml \
  -u user \
  --debug

Using RecordStream:

scmlpick \
  --playback \
  --profile playback \
  --recordstream \
  --start "2025-08-01 00:00:00" \
  --end   "2025-08-01 03:00:00" \
  --database \
  -u user \
  --debug
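
For repeated experiments it may be convenient to assemble the playback command programmatically. The following is a minimal sketch that uses only the options shown above; the helper name playback_command and its parameters are hypothetical, and it only builds the argument list without running anything.

```python
import shlex


def playback_command(profile, start, end, mseed_path=None, output_file=None,
                     user="user", debug=True):
    """Build the scmlpick playback argument list from the documented options."""
    cmd = ["scmlpick", "--playback", "--profile", profile]
    # Input source: local miniSEED if a path is given, otherwise the RecordStream.
    if mseed_path is not None:
        cmd += ["--mseed-files", mseed_path]
    else:
        cmd += ["--recordstream"]
    cmd += ["--start", start, "--end", end]
    # Output: XML file if a path is given, otherwise the SeisComP database.
    if output_file is not None:
        cmd += ["--output-file", output_file]
    else:
        cmd += ["--database"]
    cmd += ["-u", user]
    if debug:
        cmd.append("--debug")
    return cmd


cmd = playback_command(
    "playback", "2025-08-01 00:00:00", "2025-08-01 06:00:00",
    mseed_path="/data/mseed/2025-08-01/",
    output_file="/tmp/scmlpick_playback.xml",
)
print(shlex.join(cmd))  # quoted form, safe to paste into a shell
```

The resulting list can be passed to subprocess.run(cmd, check=True) once SeisComP and scmlpick are available on the PATH.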

Tuning

Controls which pre-trained models are used and how strict the decision rules are. Select P/S models appropriate for your seismicity and set probability thresholds that govern when picks are emitted; overlap and batch size affect latency and throughput.

Defines the sliding-window strategy and submission timing for inference. These parameters control window length, overlap, pre-filter warm-up, station availability requirements, and the delays/waits that separate first/second computations and delayed-data paths.

Tunes parallelism and queueing for the predictor backend. Use these settings to balance CPU usage and concurrency against your latency budget.

Applies simple, rule-based checks to location results. Uncertainty and geometry thresholds determine whether an origin is assigned the “true” or “false” evaluation status; both statuses are configurable to match your operational workflow.

ML methods

Resources Optimization

References

[ross2018]

Ross, Z.E., et al. (2018). Generalized Seismic Phase Detection with Deep Learning. Bulletin of the Seismological Society of America, 108(5), 2894–2901. https://doi.org/10.1785/0120180080

[allen1982]

Allen, R.V. (1982). Automatic phase pickers: Their present use and future prospects. Bulletin of the Seismological Society of America, 72(6), S225–S242.

Configuration

etc/defaults/global.cfg
etc/defaults/scmlpick.cfg
etc/global.cfg
etc/scmlpick.cfg
~/.seiscomp/global.cfg
~/.seiscomp/scmlpick.cfg

scmlpick inherits global options.

whiteChanns

Type: list:string

Define the list of channel types that are allowed to be processed. Example: HH, CH

eqcct.Pmodel

Type: string

EQCCT model to detect P phases.

eqcct.Smodel

Type: string

EQCCT model to detect S phases.

eqcct.windowLength

Type: int

Define the data window length in seconds to send to eqcct. Default is 60.

eqcct.filterShift

Type: double

Define the time shift, in seconds, used to initialize the filter. This data will not be processed, so timeShift must be greater than this value. Default is 2.5.

eqcct.probThreshold

Type: double

Minimum probability required to send a new pick. Default is 0.001.

eqcct.timeShift

Type: int

Define the time in seconds to overlap with the previous window. Default is 30.
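
The interplay of windowLength, timeShift, and filterShift can be illustrated with a few lines of arithmetic. This is a sketch under an assumption, not the module's implementation: it reads timeShift as the overlap described above, so consecutive windows start windowLength - timeShift seconds apart (with the defaults, every 30 s). The function name window_starts is hypothetical.

```python
def window_starts(total_seconds, window_length=60, time_shift=30, filter_shift=2.5):
    """Return the start offsets (in seconds) of full windows in a data span."""
    # Constraint stated for eqcct.filterShift: timeShift must be greater than it.
    if not filter_shift < time_shift:
        raise ValueError("timeShift must be greater than filterShift")
    if not time_shift < window_length:
        raise ValueError("the overlap must be shorter than the window")
    step = window_length - time_shift  # assumed spacing between window starts
    starts = []
    t = 0
    while t + window_length <= total_seconds:
        starts.append(t)
        t += step
    return starts


print(window_starts(180))  # [0, 30, 60, 90, 120]
```

With the defaults, three minutes of data yield five 60-second windows, each sharing 30 seconds with its predecessor.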

eqcct.minStasBulk

Type: int

Minimum number of stations with windowLength data available to send to EQCCT after firstWait. This number should be less than the total stations. At least this number of real-time stations must be available to produce picks. Default is 5.

eqcct.minDelayStasBulk

Type: int

Minimum number of delayed stations with windowLength data available to send to EQCCT after firstWait. This number should be less than the total stations. Default is 5.

eqcct.firstWait

Type: int

Minimum delay between the current time and the end time of the last minute saved in memory before all available data is computed (first computation). Default is 5.

eqcct.secondWait

Type: int

Minimum time to wait after the first computation before proceeding with the second computation. Default is 3.

eqcct.maxWait

Type: int

Maximum time to wait after the first computation before proceeding with the second computation. Default is 10.

eqcct.startLatency

Type: int

Define time in seconds to wait before sending the first data packet to eqcct. Default is 60.

eqcct.timeRemoveTrace

Type: int

Maximum time in seconds to wait before removing traces from the processing stream. Default is 600.

eqcct.traceDelay

Type: int

Minimum delay in seconds between the current time and the start time of the acquired data for that data to be routed to delayed-data processing. Default is 120.

eqcct.eqcctPthr

Type: double

Minimum probability required to send a new P pick. Default is 0.001.

eqcct.eqcctSthr

Type: double

Minimum probability required to send a new S pick. Default is 0.002.
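
The effect of the phase-specific thresholds can be sketched as a simple filter. The candidate structure (dicts with "phase" and "probability" keys) and the function name keep_picks are hypothetical, and treating a probability equal to the threshold as accepted is an assumption.

```python
# Defaults of eqcct.eqcctPthr and eqcct.eqcctSthr.
THRESHOLDS = {"P": 0.001, "S": 0.002}


def keep_picks(candidates, thresholds=THRESHOLDS):
    """Keep candidates whose probability reaches the threshold of their phase."""
    return [
        c for c in candidates
        if c["probability"] >= thresholds[c["phase"]]  # ">=" is an assumption
    ]


candidates = [
    {"phase": "P", "probability": 0.0015},
    {"phase": "S", "probability": 0.0015},
    {"phase": "S", "probability": 0.4},
]
print(keep_picks(candidates))
```

The same probability (0.0015) passes as a P candidate but is rejected as an S candidate, which shows why the defaults are very permissive and usually need raising for operational use.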

eqcct.eqcctOverlap

Type: int

If set, detection and picking are performed in overlapping windows. Default is 0.

eqcct.eqcctBatchSize

Type: int

Batch size. This won't affect the speed much but can affect the performance. A value between 200 and 1000 is recommended. Default is 1.

eqcct.gpuID

Type: int

ID of the GPU used for prediction. If using the CPU, set to None. Default is 0.

eqcct.gpuLimit

Type: int

Set the maximum percentage of memory usage for the GPU. Default is 1.

pipelines.profiles

Type: list:String

Profile names must match the profiles in the bindings configuration.

Note

pipelines.profiles.* Profiles including specific configurations.

Note

pipelines.profiles.$name.* $name is a placeholder for the name to be used and needs to be added to pipelines.profiles to become active.

pipelines.profiles = a,b
pipelines.profiles.a.value1 = ...
pipelines.profiles.b.value1 = ...
# c is not active because it has not been added
# to the list of pipelines.profiles
pipelines.profiles.c.value1 = ...

pipelines.profiles.$name.pickTargetGroup

Type: String

This is the name of the group to which pick messages are sent for a specific profile. Default is PICK.

pipelines.profiles.$name.locTargetGroup

Type: String

This is the name of the group to which location messages are sent for a specific profile. Default is LOCATION.

pipelines.profiles.$name.authorTarget

Type: String

This author should match the origin (location) author for locations to be checked against quality control parameters. Default is SCMLPICK.

pipelines.profiles.$name.region

Type: path

The BNA file defining regions (closed polygons) or tuple with 4 elements. Inside this region, the origin evaluation status is TrueOrgEvalStat, and outside is FalseOrgEvalStat. If this is set to none, then the origin location will not be checked against any polygon. Default is none.

ray.numCPUs

Type: int

Number of CPUs allocated for running the run_picker() to create the time chunks. Default is 5.

ray.maxTasksQueue

Type: int

Maximum number of tasks that are sent to the predictor at the same time. Default is 50.
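
The idea behind a cap on in-flight tasks can be shown without Ray. The sketch below deliberately swaps in Python's standard concurrent.futures as a stand-in: num_workers plays the role of ray.numCPUs and max_tasks the role of ray.maxTasksQueue. It is not how scmlpick talks to Ray, and run_bounded is a hypothetical name.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_bounded(func, items, num_workers=5, max_tasks=50):
    """Apply func to items, never keeping more than max_tasks submitted at once."""
    results = []
    pending = set()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        for item in items:
            if len(pending) >= max_tasks:
                # Block until at least one task finishes before submitting more.
                done, pending = wait(pending, return_when=FIRST_COMPLETED)
                results.extend(f.result() for f in done)
            pending.add(pool.submit(func, item))
        # Drain whatever is still running.
        done, _ = wait(pending)
        results.extend(f.result() for f in done)
    return results  # completion order, not input order


print(sorted(run_bounded(lambda x: x * x, range(10), num_workers=2, max_tasks=3)))
```

A lower cap reduces memory pressure and queueing delay at the cost of throughput; a higher cap does the opposite, which is the trade-off these two parameters expose.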

qcheck.latUncTHR

Type: int

Set the latitude uncertainty threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat. Default is 20.

qcheck.lonUncTHR

Type: int

Set the longitude uncertainty threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat. Default is 20.

qcheck.depthUncTHR

Type: int

Set the depth uncertainty threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat. Default is 20.

qcheck.depthTHR

Type: int

Set the depth threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat. Default is 20.

qcheck.azGapTHR

Type: int

Set the azimuthal gap threshold. Above this value, the origin evaluation status is FalseOrgEvalStat, and below is TrueOrgEvalStat. Default is 270.

qcheck.TrueOrgEvalStat

Type: int

Set the origin evaluation status to use when the origin fulfills all quality thresholds. Use an integer from 0 to 5, where 0=preliminary, 1=confirmed, 2=reviewed, 3=final, 4=rejected, 5=reported. Default is 5.

qcheck.FalseOrgEvalStat

Type: int

Set the origin evaluation status to use when the origin does not fulfill one or more quality thresholds. Use an integer from 0 to 5, where 0=preliminary, 1=confirmed, 2=reviewed, 3=final, 4=rejected, 5=reported. Default is 4.
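
Taken together, the qcheck parameters describe a simple rule: an origin gets the "true" status only if every threshold is respected. The sketch below illustrates that rule with the documented defaults. The origin dictionary keys, the QCheck class, and the function name are hypothetical; treating a value equal to a threshold as passing is an assumption; and the optional region check is left out.

```python
from dataclasses import dataclass

STATUS_NAMES = {0: "preliminary", 1: "confirmed", 2: "reviewed",
                3: "final", 4: "rejected", 5: "reported"}


@dataclass
class QCheck:
    # Defaults mirror the qcheck.* parameters documented above.
    lat_unc_thr: float = 20
    lon_unc_thr: float = 20
    depth_unc_thr: float = 20
    depth_thr: float = 20
    az_gap_thr: float = 270
    true_status: int = 5
    false_status: int = 4


def evaluation_status(origin, qc=None):
    """Return the evaluation status name for an origin given as a plain dict."""
    qc = qc or QCheck()
    passed = (
        origin["lat_unc"] <= qc.lat_unc_thr
        and origin["lon_unc"] <= qc.lon_unc_thr
        and origin["depth_unc"] <= qc.depth_unc_thr
        and origin["depth"] <= qc.depth_thr
        and origin["az_gap"] <= qc.az_gap_thr
    )
    return STATUS_NAMES[qc.true_status if passed else qc.false_status]


good = {"lat_unc": 3.0, "lon_unc": 4.0, "depth_unc": 6.0, "depth": 8.0, "az_gap": 120}
poor = dict(good, az_gap=300)
print(evaluation_status(good), evaluation_status(poor))
```

With the defaults, the well-constrained origin is marked reported and the one with a 300-degree azimuthal gap is marked rejected.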

Bindings

profiles

Type: list:String

Profile names must match the profiles in the module configuration.

Note

profiles.* Activate the station for a specific profile. Profiles must exist in both the module configuration and the bindings.

Note

profiles.$name.* $name is a placeholder for the name to be used and needs to be added to profiles to become active.

profiles = a,b
profiles.a.value1 = ...
profiles.b.value1 = ...
# c is not active because it has not been added
# to the list of profiles
profiles.c.value1 = ...

profiles.$name.pickEnable

Type: boolean

Enables/disables picking on a station. Default is false.

profiles.$name.filter

Type: string

Defines the filter to be used for picking.