Over three years have passed since my last post here on my pro bono efforts outside of the workplace. For the last few months, my pro bono contributions have been as Statistician for the IEEE Chicago section. The Statistician role is new for the section, and sits within the Membership Development team.
Much of my time to date has been spent getting familiar with the data that is currently available on our membership. The SAMIEEE database uses Oracle Business Intelligence Enterprise Edition (OBIEE) 11g. Unfortunately, my early efforts to obtain programmatic access to the database via IP address and port were not successful, so I create rough data sets via the web front end that OBIEE provides and then export them for analysis.
At a high level, my CSV exports from OBIEE are imported into PostgreSQL, and then I use the R language with a package called sqldf to perform analyses. Traditional SQL statements can be run from within R in this manner, with the results returned as R language data frames, making it much easier to work with the data for someone such as myself who, as a technical architect, is adept at SQL.
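As a rough illustration of this CSV-to-SQL workflow, here is a minimal sketch in Python. It is not the setup described above: it swaps in the standard library's sqlite3 module for PostgreSQL and R/sqldf so that it runs without any installation. The table name `tips`, the column names, and the sample rows are all hypothetical, since the actual OBIEE export layout is not described in this post.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for a CSV export; the real column names are assumptions.
SAMPLE_CSV = """member_id,subject,interest_level
1,Software Engineering,regularly work
2,Software Engineering,some work
3,Computer,interested
4,Computer,interested
5,Power Engineering,regularly work
"""


def load_export(conn, csv_text):
    """Load CSV rows into a table named tips (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE tips (member_id INTEGER, subject TEXT, interest_level TEXT)"
    )
    # DictReader yields one dict per row, which matches the named placeholders.
    rows = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO tips VALUES (:member_id, :subject, :interest_level)", rows
    )
    return conn


conn = load_export(sqlite3.connect(":memory:"), SAMPLE_CSV)

# Once the export is in a table, ordinary SQL can be used for analysis.
counts = conn.execute(
    "SELECT subject, COUNT(*) AS members "
    "FROM tips GROUP BY subject ORDER BY members DESC, subject"
).fetchall()
print(counts)
```

The point of the sketch is only the shape of the pipeline: export to CSV, load into a relational table, then query with plain SQL from the analysis language.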
My plan is to get into more detail regarding the specifics of my setup in later posts. As part of this first post, I want to share the first visualization I created for the IEEE Chicago section a few months ago. It is a heat map showing the top 26 technical interest profile subjects (TIPS) of active IEEE Chicago section members, created using the above technologies along with specialized R language functions such as heatmap().
The Membership Development Chair has since used this visualization in his presentation on our membership, as part of his marketing efforts at trade shows. Creating this heat map was fairly straightforward. Although working with the R language and its environment is not intuitive for someone who specializes in Java, getting the right data to create this visualization was really the hard part.
While the TIPS list might not need an explanation, since it is essentially a list of technical subjects in which members have indicated an interest, notice the labels for the three columns. Data was queried and sorted first by primary interest, which indicates that a member actually performs regular work in the area, followed by some work in the area, and finally some degree of interest, the least significant category of the three.
Traditional engineering concerns may comprise the bulk of the top 26 technical interest profile subjects. The number 26 is somewhat arbitrary, although the cutoff for whether a subject made it onto the heat map was set at 50 members in the regularly work category. Given my profession, I found it very interesting that Software Engineering was the top regular work subject and one of the top subjects in the some work category, and that Computer was the top subject in the interested category.
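The selection logic described above (count members per subject in each of the three categories, keep only subjects that meet a cutoff in the regularly work category, and sort by regular work first) can be sketched as a single SQL query. This is an illustrative Python sketch using the standard library's sqlite3 module rather than the PostgreSQL and R stack used for the actual heat map. The schema, category labels, function name `top_subjects`, and sample counts are hypothetical, and the cutoff is lowered from 50 to 2 so that the tiny sample data exercises it.

```python
import sqlite3


def make_rows(spec):
    """Expand (subject, level, n) triples into one hypothetical row per member."""
    rows, member_id = [], 0
    for subject, level, n in spec:
        for _ in range(n):
            member_id += 1
            rows.append((member_id, subject, level))
    return rows


def top_subjects(conn, min_regular, limit):
    """Return (subject, regular_work, some_work, interested) rows for the map."""
    query = """
        SELECT subject,
               SUM(CASE WHEN interest_level = 'regularly work' THEN 1 ELSE 0 END)
                   AS regular_work,
               SUM(CASE WHEN interest_level = 'some work' THEN 1 ELSE 0 END)
                   AS some_work,
               SUM(CASE WHEN interest_level = 'interested' THEN 1 ELSE 0 END)
                   AS interested
        FROM tips
        GROUP BY subject
        HAVING SUM(CASE WHEN interest_level = 'regularly work' THEN 1 ELSE 0 END) >= ?
        ORDER BY regular_work DESC, some_work DESC, interested DESC
        LIMIT ?
    """
    return conn.execute(query, (min_regular, limit)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tips (member_id INTEGER, subject TEXT, interest_level TEXT)")
conn.executemany(
    "INSERT INTO tips VALUES (?, ?, ?)",
    make_rows([
        ("Software Engineering", "regularly work", 3),
        ("Software Engineering", "some work", 2),
        ("Software Engineering", "interested", 1),
        ("Computer", "regularly work", 2),
        ("Computer", "some work", 1),
        ("Computer", "interested", 4),
        ("Antennas", "regularly work", 1),
        ("Antennas", "interested", 2),
    ]),
)

# The post used a cutoff of 50 and kept 26 subjects; the cutoff here is
# lowered to 2 only because the sample data is tiny.
result = top_subjects(conn, min_regular=2, limit=26)
print(result)
```

Each returned row corresponds to one row of the heat map, with the three counts as its columns; a subject below the cutoff (Antennas in this sample) is dropped even if many members are merely interested in it.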
As Statistician for the IEEE Chicago section, my goals are to work with the Membership Development Chair to meet his immediate needs, as well as to explore ways that we can use the data to better understand our members. In other words, I want to look at our membership at both an aggregate and an individual level. In my opinion, some of the best insights will likely come from investigating the data at a more granular level, and then determining how those findings might apply to the aggregate.