Simple Method to Calculate Median in Python

March 17, 2008 at 10:14 pm 20 comments


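def getMedian(numericValues):
  # Sort a copy so the caller's list is left untouched.
  theValues = sorted(numericValues)

  if len(theValues) % 2 == 1:
    # Odd count: return the single middle element.
    # // keeps the index an integer on both Python 2 and Python 3.
    return theValues[(len(theValues)+1)//2-1]
  else:
    # Even count: average the two middle elements.
    lower = theValues[len(theValues)//2-1]
    upper = theValues[len(theValues)//2]

    return (float(lower + upper)) / 2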

def validate(valueShouldBe, valueIs):
  print("Value Should Be: %.6f, Value Is: %.6f, Correct: %s" % (valueShouldBe, valueIs, valueShouldBe==valueIs))

validate(2.5, getMedian([0,1,2,3,4,5]))
validate(2, getMedian([0,1,2,3,4]))
validate(2, getMedian([3,1,2]))
validate(3, getMedian([3,2,3]))
validate(1.234, getMedian([1.234, 3.678, -2.467]))
validate(1.345, getMedian([1.234, 3.678, 1.456, -2.467]))

Entry filed under: CodeSnippet, Python, Statistics.

20 Comments

  • 1. cw  |  September 30, 2008 at 4:00 am

    One less computation if you do this instead :)

    return theValues[(len(theValues)-1)//2]
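The two index expressions agree whenever the length is odd, which can be checked directly. A minimal sketch, assuming Python 3 (hence `//` in place of `/`); the helper names `index_from_post` and `index_from_comment` are made up for this illustration:

```python
def index_from_post(n):
    # Middle index as written in the post: (n + 1) / 2 - 1
    return (n + 1) // 2 - 1

def index_from_comment(n):
    # Middle index as suggested in the comment: (n - 1) / 2
    return (n - 1) // 2

# Compare the two for every odd length from 1 to 99.
odd_lengths = range(1, 100, 2)
all_agree = all(index_from_post(n) == index_from_comment(n) for n in odd_lengths)
print(all_agree)
```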

  • 2. drgoettel  |  July 16, 2009 at 10:05 am

    This doesn’t work for continuous variables…
    does it?

  • 3. utah_guy  |  July 16, 2009 at 2:11 pm

    You mean the code in the post? Or the one in the first comment?

  • 4. drgoettel  |  July 16, 2009 at 4:31 pm

    both.
    A simple way to calculate the median in Python is to use the numpy module; you can read the documentation at http://docs.scipy.org/doc/numpy/user/

  • 5. utah_guy  |  October 9, 2009 at 6:05 pm

    It should work for both. The numpy module can be used, too. This is partially for instructional purposes but also for those who don’t want to install external libraries such as numpy.

  • 6. utah_guy  |  October 9, 2009 at 6:07 pm

    Actually, I should correct that statement. This is designed to work with integers and floats. It should also work with discrete variables with some minor tweaks.

  • 7. Oliver Nina  |  February 26, 2010 at 3:44 pm

    Why not use the mean() function in numpy?
    numpy.mean(numericValues)

  • 8. Oliver Nina  |  February 26, 2010 at 3:45 pm

    or median
    numpy.median(numericValues)

  • 9. utah_guy  |  February 27, 2010 at 6:29 am

    Oliver, that’s probably a great way to go, as long as you are willing to install that library. Part of the point of this post is to show the logic behind how you would find the median, rather than to say it’s the solution people should necessarily be using.
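For readers who want to skip the install, Python 3.4 and later ship a `statistics` module in the standard library with its own `median` function. A minimal sketch, assuming Python 3.4+, run on some of the same inputs the post uses for validation:

```python
import statistics

# Sample inputs taken from the validate() calls in the post.
even_case = statistics.median([0, 1, 2, 3, 4, 5])   # two middle values are averaged
odd_case = statistics.median([3, 1, 2])             # single middle value is returned
float_case = statistics.median([1.234, 3.678, -2.467])

print(even_case, odd_case, float_case)
```

The hand-written version stays useful as an illustration of the logic; this is only a dependency-free alternative.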

  • 10. Troy McConaghy  |  June 4, 2010 at 8:23 pm

    You do the len(theValues) calculation four times on the same theValues list. You could save some time by doing it once and storing the result in a variable, then using the value in that variable:

    def getMedian(numericValues):
    theValues = sorted(numericValues)
    count = len(theValues)

    if count % 2 == 1:
    return theValues[(count+1)/2-1]
    else:
    lower = theValues[count/2-1]
    upper = theValues[count/2]

    return (float(lower + upper)) / 2

  • 11. Troy McConaghy  |  June 4, 2010 at 8:25 pm

    Note: I had proper Python indenting when I entered the comment above but the commenting system removed it.
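With indentation restored, the suggestion above might look like the following. This is a sketch that assumes Python 3, so `//` replaces `/` in the index arithmetic; otherwise it follows the comment line for line.

```python
def getMedian(numericValues):
    theValues = sorted(numericValues)
    count = len(theValues)  # computed once and reused below

    if count % 2 == 1:
        # Odd count: single middle element.
        return theValues[(count + 1) // 2 - 1]
    else:
        # Even count: average of the two middle elements.
        lower = theValues[count // 2 - 1]
        upper = theValues[count // 2]
        return float(lower + upper) / 2
```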

  • 12. Noe  |  June 28, 2010 at 4:20 pm

    Hello, I did not understand the code. Can someone help me and send it in a clearer form? It is urgent.

    Regards

  • 13. Noe  |  June 28, 2010 at 4:25 pm

    Hello, I did not understand the code. Can someone help me and send it in a clearer form? It is urgent.

    Greetings

  • 14. aperture11  |  February 18, 2011 at 9:07 am

    I can’t; it keeps saying “list indices must be integers, not float”
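This error is what Python 3 raises when a list index is computed with `/`, because `/` always returns a float there, even for two integers. A minimal sketch, assuming Python 3, of the failing expression next to the `//` form that avoids it (the exact wording of the error message varies between Python versions):

```python
values = sorted([3, 1, 2])
n = len(values)

try:
    # (3 + 1) / 2 - 1 evaluates to the float 1.0, which is not a valid index.
    middle = values[(n + 1) / 2 - 1]
    failed = False
except TypeError:
    failed = True

# Floor division keeps the index an integer.
middle = values[(n + 1) // 2 - 1]
print(failed, middle)
```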

  • 15. aperture11  |  February 18, 2011 at 9:09 am

    Another way is to say %2 != 0

  • 16. Shears  |  May 1, 2012 at 9:03 pm

    Why do “programmers” always want to show each other up? The post is useful and does what it says on the box. Either appreciate it or keep walking.

  • 17. neurotik  |  May 13, 2012 at 5:05 pm

    @Shears: I don’t think it’s about showing other people up, but rather a discussion of better / alternate solutions. To that end here’s my version of it (only works with v2.5+):

    def median(values):
        """ Returns the median value from a list of numbers """
        s = sorted(values)
        l = len(s)
        return float(s[(l-1)/2]) if (l%2 == 1) else float((s[l/2]+s[(l/2)-1]))/2

  • 18. ms4py  |  July 10, 2012 at 2:14 pm

    @neurotik You can speed up the division by using the real floor-division operator (//) or bit shifting:

    In [6]: %timeit (13 - 1) / 2
    10000000 loops, best of 3: 53.3 ns per loop

    In [7]: %timeit 13 // 2
    10000000 loops, best of 3: 21.4 ns per loop

    In [8]: %timeit 13 >> 1
    10000000 loops, best of 3: 21.1 ns per loop
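To reproduce this kind of comparison outside IPython, the standard `timeit` module can be used. A rough sketch, assuming CPython 3: expressions made only of literals may be evaluated once at compile time, so the statements below operate on a variable `n` instead. The timings depend on the machine and interpreter, so none are shown here.

```python
import timeit

# Each statement halves n, as in the median index arithmetic.
statements = ["(n - 1) // 2", "n // 2", "n >> 1"]

timings = {}
for stmt in statements:
    # setup defines n so the expression cannot be reduced to a constant.
    timings[stmt] = timeit.timeit(stmt, setup="n = 13", number=100000)

for stmt in statements:
    print("%-14s %.6f s per 100000 runs" % (stmt, timings[stmt]))

# All three give the same index for an odd length such as 13.
n = 13
same_index = (n - 1) // 2 == n // 2 == n >> 1
```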

  • 19. Chad  |  July 25, 2012 at 6:02 pm

    Since integer division truncates, we can get rid of the if statements:

    def median(values):
        s=sorted(values)
        l=int(len(s)) #this is probably redundant
        return float(s[(l-1)//2]+s[l//2])/2

    If the length is 6, this code returns the mean of s[2] and s[3], but if the length is 5, it returns the mean of s[2] and s[2].
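The index arithmetic behind this can be checked for both cases. A small sketch, assuming Python 3 (so `//` stands in for truncating integer division); `middle_indices` is a name invented for this example:

```python
def middle_indices(length):
    # For odd lengths both indices coincide; for even lengths they are
    # the two central positions.
    return ((length - 1) // 2, length // 2)

def median(values):
    s = sorted(values)
    lo, hi = middle_indices(len(s))
    return float(s[lo] + s[hi]) / 2

print(middle_indices(6))
print(middle_indices(5))
print(median([0, 1, 2, 3, 4, 5]), median([0, 1, 2, 3, 4]))
```

One consequence of dropping the branch is that an odd-length input also comes back as a float, since the middle value is added to itself and then halved.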

  • 20. asas  |  March 28, 2014 at 3:08 pm

    Since integer division truncates, we can get rid of the if statements:
    def median(values):
        s=sorted(values)
        l=int(len(s)) #this is probably redundant
        return float(s[(l-1)//2]+s[l//2])/2
    If the length is 6, this code returns the mean of s[2] and s[3], but if the length is 5, it returns the mean of s[2] and s[2].
