Sunday, July 3, 2016

Heart rate and sentiment experiment design

Photo GrejGuide.dk @ Flickr
After I finished the first experiment and published an article at the HEALTHINF 2016 conference in Rome this year, I started thinking about the design of the next experiment.

As I mentioned in the improvements section of the previous paper, I tried to learn from my earlier mistakes and improve a lot. First, the step count accumulates during the day, so consecutive values are not really independent. Heart rate would be better because it does not accumulate in this way. Second, I can improve my sentiment records, both in timing and in evaluation. And last but not least is sentiment extraction: instead of the supervised learning used in the previous work, I would like to use unsupervised classification.

So, let's get to the details.

Experiment Design

We are still looking for a relation between soft data (sentiment) and hard data (a measurand). In the first experiment it was text recorded via Twitter and footsteps. This time it is again text recorded via Twitter, but instead of footsteps it is heart rate, which is more independent.

The main objectives are:
  • One month experiment (30 days)
  • 20 tweets per day (600 tweets minimum)
  • Continuous heart rate measurement (24/7)
  • Effective heart rate measurement time between 7:00 and 23:00, i.e. 16 hours a day; the rest is used for charging the wristband
  • No sleep activity monitoring
  • Steps monitoring? Perhaps.

Text collection

  • Still using Twitter (API, automatic timeline)
  • Still at least 20 tweets per day
  • Keep precise tweet timing: 16 hours (7:00 to 23:00) divided by 20 tweets means one tweet every 48 minutes, so in practice tweeting regularly every 40 to 50 minutes
  • Mood expressed by describing an activity or feelings in 140 characters
  • Adding an identifying hashtag similar to the one in the previous experiment: #xsfb (eXperiment Sentiment Fitness Band)
  • Adding a mood hashtag on one of these scales: #n, #p and possibly #x (negative, positive, neutral), numbers #0, #1, #2 (negative, neutral, positive), or a "stars" scale #1 to #5
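
The timing and tagging rules above can be sketched in a few lines of Python. This is only an illustration, not the tooling used in the experiment: the function names tweet_schedule and parse_mood are hypothetical, and the sketch assumes the #n / #x / #p variant of the mood scale, mapped to -1 / 0 / 1.

```python
import re
from datetime import datetime, timedelta


def tweet_schedule(day, start_hour=7, end_hour=23, tweets_per_day=20):
    """Return evenly spaced tweet times for one day of the experiment."""
    start = datetime(day.year, day.month, day.day, start_hour)
    window_minutes = (end_hour - start_hour) * 60            # 16 h -> 960 min
    step = timedelta(minutes=window_minutes / tweets_per_day)  # 960 / 20 = 48 min
    return [start + i * step for i in range(tweets_per_day)]


# Assumed mapping for the #n / #x / #p variant of the mood scale.
MOOD_TAGS = {"n": -1, "x": 0, "p": 1}
# The word boundary keeps "#xsfb" from being read as the neutral tag "#x".
MOOD_RE = re.compile(r"#([npx])\b", re.IGNORECASE)


def parse_mood(text):
    """Return -1, 0 or 1 for an experiment tweet, or None if it is untagged."""
    if "#xsfb" not in text.lower():
        return None  # not part of the experiment
    match = MOOD_RE.search(text)
    return MOOD_TAGS[match.group(1).lower()] if match else None
```

With the defaults, the first slot of a day is 7:00 and the last one is 22:12, which leaves the full 48-minute gap before the wristband goes on the charger at 23:00.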

Sentiment extraction

  • Unsupervised machine learning - classification
  • Need to find the best machine learning solution
  • Possibly human evaluation again
    • Split the 600 tweets into 10 batches of 60 tweets
    • Each of 60 people evaluates the sentiment of one batch of 60 tweets → equivalent to 6 people each evaluating all 600 tweets → fewer problems with overloading the evaluators
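
The arithmetic behind the human evaluation (60 people × 60 tweets = 3600 ratings, i.e. 6 ratings for each of the 600 tweets) can be checked with a small sketch. The helper names make_batches and assign_raters are hypothetical, and the round-robin assignment is just one simple way to balance the batches, not a description of how the evaluation was actually organized.

```python
import random


def make_batches(tweet_ids, batch_size=60):
    """Split the tweet ids into consecutive batches of batch_size."""
    return [tweet_ids[i:i + batch_size]
            for i in range(0, len(tweet_ids), batch_size)]


def assign_raters(rater_ids, n_batches, seed=0):
    """Shuffle the raters, then deal them round-robin onto the batches."""
    shuffled = list(rater_ids)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    assignment = {batch: [] for batch in range(n_batches)}
    for i, rater in enumerate(shuffled):
        assignment[i % n_batches].append(rater)
    return assignment


batches = make_batches(list(range(600)))           # 10 batches of 60 tweets
assignment = assign_raters(range(60), len(batches))  # 6 raters per batch
```

Each rater sees only 60 tweets, yet every tweet still ends up with 6 independent ratings.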

Measured values collection

  • Using 24/7 heart rate wearable
  • Export data through an API, ANT/ANT+ and a 3rd-party application, an Excel sheet or the FIT format
  • Output is a time series with as fine a sampling interval as we can get: at least one value per minute, preferably more often
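
Whatever the export format turns out to be, the raw samples will probably need to be brought onto a common per-minute grid. A minimal sketch, assuming the export has already been parsed into (timestamp, bpm) pairs; the function name per_minute_mean and this input layout are assumptions, not the format of any particular wearable.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def per_minute_mean(samples, start_hour=7, end_hour=23):
    """Average (timestamp, bpm) samples into one value per minute.

    Samples outside the 7:00-23:00 measurement window are dropped.
    """
    buckets = defaultdict(list)
    for ts, bpm in samples:
        if start_hour <= ts.hour < end_hour:
            # Truncate the timestamp to the start of its minute.
            buckets[ts.replace(second=0, microsecond=0)].append(bpm)
    return sorted((minute, mean(values)) for minute, values in buckets.items())


samples = [
    (datetime(2016, 7, 4, 7, 0, 5), 60),
    (datetime(2016, 7, 4, 7, 0, 35), 70),
    (datetime(2016, 7, 4, 7, 1, 10), 80),
    (datetime(2016, 7, 4, 23, 30, 0), 55),  # charging time, ignored
]
series = per_minute_mean(samples)
```

Minutes without any sample are simply absent from the result, so gaps in the recording stay visible instead of being filled in silently.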

Measurement precision

  • Delegated as bachelor thesis
  • Up to 3 devices compared against a reference (EKG sensors or a wearable with a chest HR strap)
  • With additional work on:
    • Principles of HR methods
    • Principles of HR sensors
    • Collaboration on the data export implementation
  • Summarize the results of the bachelor thesis in ½ to 1 column of my publication, as an introduction to the precision of HR wearables

Analytical methods

  • Analysis of the continuous heart rate time series together with the discrete sentiment time series, and of their relation (seasonal behavior → repeating patterns and their relation to sentiment)
  • Similar to financial market (stocks, Forex) analysis
  • Spearman's rank correlation coefficient 
  • Searching for patterns, trends, correlation, etc.
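
For reference, Spearman's rank correlation coefficient is the Pearson correlation computed on ranks instead of raw values. The sketch below uses only the standard library and gives tied values the mean of their ranks, which matters here because a 3-level or 5-level mood scale produces many ties. The function names are illustrative, and in practice a library routine such as scipy.stats.spearmanr would do the same job.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; fails with ZeroDivisionError if x or y is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks of x and y."""
    return pearson(average_ranks(x), average_ranks(y))
```

A possible use would be to pair each tweet's mood score with, for example, the mean heart rate in the minutes before that tweet and pass the two lists to spearman; the choice of window is an open design decision, not something fixed by this plan.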

That's it. It's not "rocket science", but it's science. Next time I will describe how it works: how I got used to it, what adjustments the experiment design needed, and an overview of the results.
