Question: In Snowflake's web interface, create a worksheet that selects data from the medical labs table
DXRX_DATA_PROD.MEDICAL_LABS.PATIENT_PROCEDURES. From this data table we would like to select columns:
DVPATIENTID
TESTORDERID
ACCESSIONNUMBER
REFERRINGPROVIDERNPINUMBER
ORDERINGPROVIDERNPINUMBER
TESTORDEREDDATE
TESTREPORTEDDATE
PERFORMINGORGANIZATIONNAME
PANELNAME
TESTNAME
TESTCODE
RESULTNAME
RESULTCODE
RESULTVALUE
RESULTCOMMENTS.
We want to filter for rows that contain HER2 (under any of its spellings) in the results fields,
for tests reported in the period of interest. This filter infers that the patients have been tested
for the biomarker HER2. In cases where there is no test reported date, please make use of the ordered
date. This filter should be used in the further activities. Create a new column that selects
TESTORDERID by default and, if it is missing, uses the accession number; name the column TESTID.
The solution for this question is:
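WITH herpatients AS (
    SELECT
        DVPATIENTID, TESTORDERID, ACCESSIONNUMBER, REFERRINGPROVIDERNPINUMBER,
        ORDERINGPROVIDERNPINUMBER, TESTORDEREDDATE, TESTREPORTEDDATE,
        PERFORMINGORGANIZATIONNAME, PANELNAME, TESTNAME, TESTCODE,
        RESULTNAME, RESULTCODE, RESULTVALUE, RESULTCOMMENTS,
        -- Use the test order ID; fall back to the accession number when it is missing
        COALESCE(TESTORDERID, ACCESSIONNUMBER) AS TESTID,
        -- Use the reported date; fall back to the ordered date when it is missing
        COALESCE(TESTREPORTEDDATE, TESTORDEREDDATE) AS TESTDATE
    FROM DXRX_DATA_PROD.MEDICAL_LABS.PATIENT_PROCEDURES
    WHERE
        -- REGEXP_LIKE matches the whole string, hence the .* wrappers; 'i' = case-insensitive.
        -- The HER2 / ERBB2 spelling of the biomarker is assumed here.
        (   REGEXP_LIKE(RESULTNAME, '.*(HER2|ERBB2).*', 'i')
         OR REGEXP_LIKE(RESULTVALUE, '.*(HER2|ERBB2).*', 'i')
         OR REGEXP_LIKE(RESULTCOMMENTS, '.*(HER2|ERBB2).*', 'i'))
        -- The date pattern is not stated; $test_date_pattern is a hypothetical session variable
        AND TO_VARCHAR(TESTDATE) LIKE $test_date_pattern
)
SELECT *
FROM herpatients;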
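One detail worth checking when adapting the filter: in Snowflake, REGEXP_LIKE matches the pattern against the entire string rather than searching for it anywhere inside the value, so a bare keyword only matches values that consist of exactly that keyword. The query below is a minimal sketch of that behaviour. The sample strings are made up, and the HER2 / ERBB2 keywords are an assumption about the intended biomarker spelling.

```sql
-- Hypothetical sample strings; none of these come from PATIENT_PROCEDURES.
SELECT
    -- Bare keyword: the pattern is implicitly anchored at both ends, so this is FALSE
    REGEXP_LIKE('HER2 by IHC', 'HER2', 'i')                  AS bare_keyword,
    -- Wrapping the keyword in .* turns it into a "contains" test: TRUE
    REGEXP_LIKE('HER2 by IHC', '.*HER2.*', 'i')              AS wrapped_keyword,
    -- Alternation plus the case-insensitive flag: TRUE
    REGEXP_LIKE('erbb2 amplified', '.*(HER2|ERBB2).*', 'i')  AS alternation,
    -- ILIKE with % wildcards is a simpler case-insensitive "contains" check: TRUE
    'HER2 by IHC' ILIKE '%HER2%'                             AS ilike_contains;
```

If the result fields can contain line breaks, the `s` parameter (for example `'is'`) may also be needed so that `.` matches newlines.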
From the data retrieved in the exercise above, you will have the same test results reported over
multiple dates and sometimes under different performing organisation names. This duplication happens
in a system where the same result may be captured multiple times, and the repeats seem to cluster
within a window of days. How would we collapse this data to show a single record where the results
are the same? Please explain the approach in bullet points and show it in SQL. Please show
examples of before and after your deduplication process for an individual patient's results.
a) The question is done in a Snowflake SQL worksheet, so the SQL query should be compatible with it.
b) Please explain how the question works and the logic behind it.
c) Please give me a solution and an explanation for this question as well.
I can give a snippet of a possible solution:
WITH Q AS (
    SELECT DVPATIENTID, TESTCODE, TESTNAME, ORDERINGPROVIDERNPINUMBER, PERFORMINGORGANIZATIONNAME, RESULTCODE, RESULTNAME,
           RESULTVALUE, RESULTCOMMENTS, TESTDATE,
           -- Add a column to calculate the date difference in days using LAG and DATEDIFF
           DATEDIFF(day,
                    LAG(TESTDATE) OVER (PARTITION BY DVPATIENTID, TESTCODE, TESTNAME, ORDERINGPROVIDERNPINUMBER, RESULTCODE,
                                                     RESULTNAME, RESULTVALUE, RESULTCOMMENTS
                                        ORDER BY TESTDATE),
                    TESTDATE) AS DATE_DIFF
    FROM hertests  -- the filtered result set from the first exercise
)
-- Filter out the rows where the date difference is less than the chosen number of days.
-- $dedup_window_days is a hypothetical session variable holding that threshold.
SELECT *
FROM Q
WHERE DATE_DIFF IS NULL OR DATE_DIFF >= $dedup_window_days
ORDER BY DVPATIENTID, TESTCODE, TESTNAME, ORDERINGPROVIDERNPINUMBER, PERFORMINGORGANIZATIONNAME, RESULTCODE, RESULTNAME,
         RESULTVALUE, RESULTCOMMENTS, TESTDATE;
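Explanation:
The LAG function looks up the previous test date within the PARTITION BY clause, which covers all
the columns that make up the duplicate values.
PERFORMINGORGANIZATIONNAME is left out of the partition, so the same result captured under different
organisation names still collapses.
Check the date difference: keep a row when the difference meets the threshold, or when it is NULL,
which is the first row in each partition, where no previous row exists.
Fewer rows remain after deduplication based on the columns: DVPATIENTID, TESTCODE, TESTNAME,
ORDERINGPROVIDERNPINUMBER, RESULTCODE, RESULTNAME, RESULTVALUE,
RESULTCOMMENTS.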
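The question also asks for a before-and-after example. The query below is a rough sketch of one way to show that, not the original author's worked answer. It runs the same LAG and DATEDIFF logic over a handful of made-up rows for one hypothetical patient. The shortened column names, the sample values, and the 14-day window are all placeholders chosen for illustration; the real threshold should come from the clustering seen in the data.

```sql
-- Made-up rows for one hypothetical patient; nothing here is real lab data.
WITH sample_results AS (
    SELECT patient_id, test_code, result_value, performing_org, test_date::DATE AS test_date
    FROM (VALUES
        ('P1', 'T100', 'POSITIVE', 'Lab A', '2020-01-01'),
        ('P1', 'T100', 'POSITIVE', 'Lab B', '2020-01-03'),  -- same result, 2 days later, other org
        ('P1', 'T100', 'POSITIVE', 'Lab A', '2020-06-01'),  -- same result, months later
        ('P1', 'T100', 'NEGATIVE', 'Lab A', '2020-01-02')   -- different result value
    ) AS v (patient_id, test_code, result_value, performing_org, test_date)
),
with_gap AS (
    SELECT
        *,
        -- Days since the previous row with the same patient, test and result;
        -- NULL for the first row of each partition
        DATEDIFF(day,
                 LAG(test_date) OVER (PARTITION BY patient_id, test_code, result_value
                                      ORDER BY test_date),
                 test_date) AS days_since_prev
    FROM sample_results
)
-- "Before" is every row; "after" is the subset where kept_after_dedup is TRUE.
SELECT
    *,
    (days_since_prev IS NULL OR days_since_prev >= 14) AS kept_after_dedup  -- 14 is a placeholder
FROM with_gap
ORDER BY patient_id, test_code, result_value, test_date;
```

With these rows, only the second POSITIVE row (two days after the first, from a different organisation) should come out flagged as a duplicate; the other three rows are kept. Because each row is compared with the row immediately before it, a long chain of repeats that are each only a few days apart collapses into its first row.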
