Leverage SQLite for Lightweight ELT in Remote Patient Monitoring Data

If you’re an SQL engineer trying to learn the ropes of data engineering, you might have heard the term ELT and wondered how it differs from ETL. Both are processes for integrating data from multiple sources, but ELT stands for Extract, Load, Transform, while ETL is Extract, Transform, Load. In this blog post, we will explore how to leverage SQLite and SQL views for a lightweight ELT process that integrates disparate tables into a unified view.

Our specific use case will involve aggregating patient remote monitoring data from various devices into a single unified view for Continuous Glucose Monitoring (CGM) tracings. We’ll break down each step in a way that’s approachable and practical, giving you the tools to work with real data while keeping the infrastructure lightweight.

Background on ELT and Why We Use It

The classic ETL strategy involves transforming data before loading it into your storage system, which typically requires more complex workflows, external tools, and a lot of up-front work. In contrast, ELT lets you extract the data as-is, load it into your database, and then transform it in place, often with SQL views, which makes it great for exploratory work or environments with less infrastructure.

SQLite is a great fit here because it’s lightweight, widely supported, and doesn’t require complex setup—perfect for small to medium datasets or rapid prototyping.

Setting the Scene: Ingesting the Data

Imagine we have data from multiple devices—perhaps CGMs, smartwatches, and other monitoring devices—that all capture remote patient monitoring data. After ingesting these data sources, we end up with tables like table_1, table_2, table_3, and table_4 in our SQLite database. Each table represents a different device and has different columns, even though they all describe patient data for similar remote monitoring purposes.

For example:

table_1 has columns like patient_id, device_id, timestamp, glucose_level
table_2 has columns like id, time_recorded, patient_number, cgm_value
table_3 has columns like pat_id, recorded_at, glucose_reading, sensor
table_4 has columns like identifier, time_taken, patient_ref, sugar_level

The Challenge: Creating a Unified View

We need to create a single unified view called patient_rpm_mode_cgm that gives us all CGM tracings in a common format. Since ELT focuses on transforming data in place, we will write SQL to transform and union the data from these disparate tables. Our ultimate goal is to create a view that presents common column names—let’s standardize them to:

patient_id
timestamp
glucose_level
device_type (a new column that does not exist in the physical tables)
source (a new column to indicate the origin table)

Step 1: Understanding the Source Tables

The first step in transforming this data is to understand how each source table maps to our target columns. To standardize the columns:

Source Table	Source Columns	Target Columns
table_1	`patient_id`, `timestamp`, `glucose_level`	`patient_id`, `timestamp`, `glucose_level`, ‘CGM’ AS `device_type`, ‘table_1’ AS `source`
table_2	`patient_number`, `time_recorded`, `cgm_value`	`patient_id`, `timestamp`, `glucose_level`, ‘CGM’ AS `device_type`, ‘table_2’ AS `source`
table_3	`pat_id`, `recorded_at`, `glucose_reading`	`patient_id`, `timestamp`, `glucose_level`, ‘CGM’ AS `device_type`, ‘table_3’ AS `source`
table_4	`patient_ref`, `time_taken`, `sugar_level`	`patient_id`, `timestamp`, `glucose_level`, ‘CGM’ AS `device_type`, ‘table_4’ AS `source`

Step 2: Writing the Transformation Queries

We need to write queries that extract the relevant fields from each table, aliasing the columns to standardize their names, and adding new columns as needed.

For example, to transform table_2:

SELECT
    patient_number AS patient_id,
    time_recorded AS timestamp,
    cgm_value AS glucose_level,
    'CGM' AS device_type,
    'table_2' AS source
FROM
    table_2;

Similarly, for table_3:

SELECT
    pat_id AS patient_id,
    recorded_at AS timestamp,
    glucose_reading AS glucose_level,
    'CGM' AS device_type,
    'table_3' AS source
FROM
    table_3;

Step 3: Combining the Queries with UNION

Next, we need to combine these transformed queries using UNION ALL. Using UNION ALL is appropriate here because it ensures we retain all records, even if they have duplicate values (which may be necessary for auditing or detailed analysis).

Here’s the complete SQL to create our view:

CREATE VIEW patient_rpm_mode_cgm AS
SELECT
    patient_id,
    timestamp,
    glucose_level,
    'CGM' AS device_type,
    'table_1' AS source
FROM
    table_1

UNION ALL

SELECT
    patient_number AS patient_id,
    time_recorded AS timestamp,
    cgm_value AS glucose_level,
    'CGM' AS device_type,
    'table_2' AS source
FROM
    table_2

UNION ALL

SELECT
    pat_id AS patient_id,
    recorded_at AS timestamp,
    glucose_reading AS glucose_level,
    'CGM' AS device_type,
    'table_3' AS source
FROM
    table_3

UNION ALL

SELECT
    patient_ref AS patient_id,
    time_taken AS timestamp,
    sugar_level AS glucose_level,
    'CGM' AS device_type,
    'table_4' AS source
FROM
    table_4

UNION ALL

SELECT
    'unknown' AS patient_id,
    '1970-01-01 00:00:00' AS timestamp,
    0 AS glucose_level,
    'CGM' AS device_type,
    'synthetic' AS source;

Step 4: Adding More Transformations with Views

One of the key benefits of ELT using views is the ability to easily add more transformations without altering the raw data or writing complex ETL pipelines. Here are some additional common transformations that are better handled through views:

1. Standardizing Data Formats

In many cases, different tables may store data in different formats. For example, timestamps might be stored in different formats or time zones. Using a view, you can standardize these formats:

CREATE VIEW standardized_patient_rpm_mode_cgm AS
SELECT
    patient_id,
    DATETIME(timestamp) AS standardized_timestamp,
    glucose_level,
    device_type,
    source
FROM
    patient_rpm_mode_cgm;

This view ensures that all timestamps are in the same format, making downstream analysis much easier.

2. Filtering and Cleaning Data

You may want to exclude certain rows from analysis, such as rows with missing or invalid data. Views are a great way to create a “clean” dataset:

CREATE VIEW clean_patient_rpm_mode_cgm AS
SELECT
    patient_id,
    timestamp,
    glucose_level,
    device_type,
    source
FROM
    patient_rpm_mode_cgm
WHERE
    glucose_level > 0;

This view filters out any rows where glucose_level is 0 or negative, which may represent invalid data.

3. Aggregating Data

You can also use views to create aggregate data that can be used for reporting or analysis. For example, creating a view that provides daily average glucose levels for each patient:

CREATE VIEW daily_avg_glucose AS
SELECT
    patient_id,
    DATE(timestamp) AS day,
    AVG(glucose_level) AS avg_glucose_level
FROM
    patient_rpm_mode_cgm
GROUP BY
    patient_id, day;

This aggregated view makes it easy to analyze trends over time without needing to write aggregation queries repeatedly.

4. Creating Derived Metrics

If you need to create new metrics based on existing columns, views are a great way to handle this. For example, you might want to create a derived metric called glucose_category to categorize glucose levels:

CREATE VIEW patient_glucose_category AS
SELECT
    patient_id,
    timestamp,
    glucose_level,
    CASE
        WHEN glucose_level < 70 THEN 'Low'
        WHEN glucose_level BETWEEN 70 AND 140 THEN 'Normal'
        WHEN glucose_level > 140 THEN 'High'
        ELSE 'Unknown'
    END AS glucose_category,
    device_type,
    source
FROM
    patient_rpm_mode_cgm;

This view adds a new column that categorizes glucose levels into ‘Low’, ‘Normal’, or ‘High’.

Step 5: Validating the Unified View

After creating the view, it’s always good practice to validate the results. You can use a SELECT query to make sure everything looks right:

SELECT * FROM patient_rpm_mode_cgm LIMIT 10;

Review the data to ensure that the column names are standardized and that the values align as expected. Pay particular attention to the timestamp column to ensure formats are consistent.

Step 6: Leveraging the View for Downstream Analysis

With the patient_rpm_mode_cgm view in place, downstream processes can now treat this data as a consistent and unified source. Analysts can run queries like:

SELECT
    patient_id,
    AVG(glucose_level) AS avg_glucose_level
FROM
    patient_rpm_mode_cgm
GROUP BY
    patient_id;

This allows for seamless analysis without needing to worry about device-specific table structures.

Why ELT with SQLite?

You might wonder why ELT is a good fit for this scenario. Here are some reasons:

Flexibility: ELT allows you to load data as-is and apply transformations later when you have a better understanding of the data.
Simplicity: SQLite is simple to set up, and using views means that transformations are written declaratively with SQL, which is easy for teams to understand and modify.
Lightweight: No heavyweight ETL tools are required, which makes this approach perfect for small datasets or prototyping.

Conclusion

By using SQLite and SQL views, we’ve demonstrated a lightweight and modern approach to ETL (or, more precisely, ELT) that helps simplify the process of integrating data from multiple sources. This approach allows for greater flexibility, and by leveraging SQL views, we can keep the transformation logic declarative and transparent.

Whether you’re prototyping a new data pipeline, working with smaller datasets, or need a low-maintenance integration solution, this ELT strategy with SQLite is an excellent option. We hope this guide helps you get started on your journey towards modern data engineering!

Next Steps

To deepen your understanding, try adding more transformations or aggregations to the patient_rpm_mode_cgm view. You could, for example, normalize the timestamp formats or add additional metadata to the view to help with analysis. Feel free to experiment and explore how SQLite’s capabilities can further simplify your data engineering workflow.

Modern Lightweight ETL with SQLite

Background on ELT and Why We Use It

Setting the Scene: Ingesting the Data

The Challenge: Creating a Unified View

Step 1: Understanding the Source Tables

Step 2: Writing the Transformation Queries

Step 3: Combining the Queries with UNION

Step 4: Adding More Transformations with Views

1. Standardizing Data Formats

2. Filtering and Cleaning Data

3. Aggregating Data

4. Creating Derived Metrics

Step 5: Validating the Unified View

Step 6: Leveraging the View for Downstream Analysis

Why ELT with SQLite?

Conclusion

Next Steps

Was this post helpful?

Related articles

How surveilr Transforms Secure Health Data Exchange in Clinical Workflows

Evidence-Based Decision Making

Unlocking surveilr's Full Potential the Smart Way: The Power of SQL and SQL views capabilities

Quality Management for Safety-Critical Industries

How `surveilr` Transforms Diabetes Research Study Data Management

RSSDs: Portability of Excel, Power of SQL

The surveilr Ecosystem: Core and Patterns

surveilr: A Local-First, Edge-based SQL-Centric Integration Engine

Vendor Assessment and Corporate Compliance Evidence Warehouse

Why Stateful Integration?

Maximizing Curated Content with Surveilr’s Content Assembler Pattern

Harnessing SQL for Compliance: Building a Powerful Compliance Explorer Tool

Sign in

Sign up

Forgot password?

Modern Lightweight ETL with SQLite

Background on ELT and Why We Use It

Setting the Scene: Ingesting the Data

The Challenge: Creating a Unified View

Step 1: Understanding the Source Tables

Step 2: Writing the Transformation Queries

Step 3: Combining the Queries with UNION

Step 4: Adding More Transformations with Views

1. Standardizing Data Formats

2. Filtering and Cleaning Data

3. Aggregating Data

4. Creating Derived Metrics

Step 5: Validating the Unified View

Step 6: Leveraging the View for Downstream Analysis

Why ELT with SQLite?

Conclusion

Next Steps

Was this post helpful?

Related articles

How surveilr Transforms Secure Health Data Exchange in Clinical Workflows

Evidence-Based Decision Making

Unlocking surveilr's Full Potential the Smart Way: The Power of SQL and SQL views capabilities

Quality Management for Safety-Critical Industries

How `surveilr` Transforms Diabetes Research Study Data Management

RSSDs: Portability of Excel, Power of SQL

The surveilr Ecosystem: Core and Patterns

surveilr: A Local-First, Edge-based SQL-Centric Integration Engine

Vendor Assessment and Corporate Compliance Evidence Warehouse

Why Stateful Integration?

Maximizing Curated Content with Surveilr’s Content Assembler Pattern

Harnessing SQL for Compliance: Building a Powerful Compliance Explorer Tool