Posts filed under 'Statistics'

Simple Method to Calculate Median in Python

(Note: Please see my latest posts at my new blog!)

def getMedian(numericValues):
  theValues = sorted(numericValues)
  count = len(theValues)

  if count % 2 == 1:
    # Odd count: return the middle element (// keeps the index an integer on Python 3)
    return theValues[count // 2]
  else:
    # Even count: average the two middle elements
    lower = theValues[count // 2 - 1]
    upper = theValues[count // 2]

    return float(lower + upper) / 2

def validate(valueShouldBe, valueIs):
  print("Value Should Be: %.6f, Value Is: %.6f, Correct: %s" % (valueShouldBe, valueIs, valueShouldBe==valueIs))

validate(2.5, getMedian([0,1,2,3,4,5]))
validate(2, getMedian([0,1,2,3,4]))
validate(2, getMedian([3,1,2]))
validate(3, getMedian([3,2,3]))
validate(1.234, getMedian([1.234, 3.678, -2.467]))
validate(1.345, getMedian([1.234, 3.678, 1.456, -2.467]))
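As a cross-check, Python 3.4 and later ship a median function in the standard statistics module. Below is a minimal sketch, assuming Python 3. The helper name medianOf and the sample lists are made up here for illustration; the helper simply mirrors the sort-and-pick-the-middle approach above.

```python
import statistics

def medianOf(numericValues):
    # Hypothetical helper mirroring the approach above: sort, then take the
    # middle element (odd count) or average the two middle elements (even count).
    ordered = sorted(numericValues)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    return (ordered[middle - 1] + ordered[middle]) / 2

samples = [[0, 1, 2, 3, 4, 5], [0, 1, 2, 3, 4], [3, 2, 3]]
for sample in samples:
    # The hand-rolled value and the standard-library value should agree
    print(sample, medianOf(sample), statistics.median(sample))
```

If the two columns ever disagree, the hand-rolled version is the one to suspect.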

March 17, 2008

Computing Chi-Squared P-Value from Contingency Table in Python

Update: Here is a link to notes from my Stats class that give some background (starting on page 5): http://episun7.med.utah.edu/~alun/teach/stats/week05.pdf

To do this you need SciPy installed. Below is one way to do it. I’m sure there’s a more efficient way, but this works for me. Any feedback is welcome.

import scipy.stats

def computeContingencyTablePValue(*observedTuples):
  # Each argument is one row of observed counts; all rows must have the same length
  if len(observedTuples) == 0: return None

  for row in observedTuples:
    if len(row) != len(observedTuples[0]): return None

  rowSums = []
  for row in observedTuples:
    rowSums.append(float(sum(row)))

  columnSums = []
  for i in range(len(observedTuples[0])):
    columnSum = 0.0
    for row in observedTuples:
      columnSum += row[i]
    columnSums.append(float(columnSum))

  grandTotal = float(sum(rowSums))

  observedTestStatistic = 0.0
  for i in range(len(observedTuples)):
    for j in range(len(columnSums)):
      # Expected count under independence: rowSum * columnSum / grandTotal
      expectedValue = (rowSums[i]/grandTotal)*(columnSums[j]/grandTotal)*grandTotal
      observedValue = float(observedTuples[i][j])

      observedTestStatistic += ((observedValue - expectedValue)**2) / expectedValue

  degreesFreedom = (len(columnSums) - 1) * (len(rowSums) - 1)
  # chisqprob is gone from newer SciPy releases; chi2.sf gives the same upper-tail probability
  return scipy.stats.chi2.sf(observedTestStatistic, degreesFreedom)
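To see what the function computes, here is a small worked example that needs only the standard library. It is a sketch under a few assumptions: Python 3, a made-up 2x2 table of counts, and the fact that with exactly 1 degree of freedom the upper-tail chi-squared probability equals erfc(sqrt(x/2)). Larger tables still need SciPy for the p-value.

```python
import math

# Made-up 2x2 table of observed counts, for illustration only
observed = [[10, 20], [30, 40]]

rowSums = [sum(row) for row in observed]
columnSums = [sum(column) for column in zip(*observed)]
grandTotal = sum(rowSums)

statistic = 0.0
for i, row in enumerate(observed):
    for j, observedValue in enumerate(row):
        # Expected count under independence: rowSum * columnSum / grandTotal
        expectedValue = rowSums[i] * columnSums[j] / grandTotal
        statistic += (observedValue - expectedValue) ** 2 / expectedValue

# Valid only for 1 degree of freedom, i.e. a 2x2 table
pValue = math.erfc(math.sqrt(statistic / 2))
print(statistic, pValue)
```

For this table the expected counts are 12, 18, 28 and 42, so the statistic works out to 4/12 + 4/18 + 4/28 + 4/42 = 50/63, roughly 0.794.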

February 13, 2008