<h2>Using R-Studio with Meditech Data Repository</h2>

Using R-Studio with Meditech Data Repository

I’m often asked about my tool set for working with the Meditech Data Repository (DR) to mine clinical data. In most cases I use the tools that come bundled with the underlying MS SQL Server. They cost nothing extra, so there’s no excuse for not using them.

SQL Server Management Studio (SSMS) and SQL Server Reporting Services (SSRS) are simple to use once setup and configured properly and they’re great for basic observational analysis (reports and dashboards). If management need more eye catching eye-candy in their dashboards, there’s always d3.js (which I absolutely love!).

Lately, I’ve been spending quality time with the free and open source R-Studio writing R code. This is widely used in non-healthcare settings but for some reason hasn’t made in-roads in Healthcare (with the exception of academic and research environments). While the learning curve is initially steep it is proving very worth while.

R-Studio can connect directly to the DR with the RODBC package and gives you access to a huge armory of data processing, math and statistics tools along with advanced charting and illustration.

If you really want to do “predictive analytics” with very large multivariate datasets, this free and open source package offers machine learning, modeling and more statistical packages than I know what to do with. Linear regressions and quantiles in single commands. Much easier than TSQL.

Installation of R and then R-Studio couldn’t be simpler with both Mac and Windows installers.

To get you started this code snippet will pull the latest data from [livedb.[dbo].[AdmVisits], select a column and throw a chart on the page. From here on, the possibilities are limited by your imagination and persistence.

# Load an ODBC database handler
# Load plotting library
# Create a connection handle
dbhandle <- odbcDriverConnect('driver={SQL Server};
# Fire the SQL query and assign results to a data.frame
results <- as.data.frame(sqlQuery(dbhandle, '
SELECT TOP 500 * FROM [livedb].[dbo].[AdmVisits]
# Close the database connection
# Summarize the returned data to the console
# Subset your results for a desired column
race <- results [ , "RaceName"]
# Illustrate it
Adventures in Data...
Hey There!
What is This?