How to read a web page using Python

This guide shows you how to read a web page from a Python script, given its URL.

Instructions

    1

    Install the Python interpreter from the link below. It is free.

    http://www.python.org/download/releases/2.5.2/

    2

    After you have installed Python, you can run it from the Start menu:

    Python 2.5 - IDLE (Python GUI)

    3

    Once the application opens, it looks like Notepad. Select File from the menu at the top, then New Window. This opens a new text window where you can write and save your Python code.

    4

    Copy the following code into the new window that just opened. The indented line under the for statement is the body of the loop and runs once for each line of the page.

    import urllib

    # Open the URL; the result can be read like a file.
    filehandle = urllib.urlopen('http://www.loothog.com')

    # Print the page one line at a time.
    for lines in filehandle.readlines():
        print lines

    filehandle.close()
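
The role of indentation in the loop can be tried out without a network connection. The following is only a sketch: sample_lines and printed are made-up names, and the list stands in for the lines of a downloaded page. It is written so that it should run under Python 2.5 as well as newer versions.

```python
# Hypothetical sample data standing in for the lines of a downloaded page.
sample_lines = ['<html>\n', '<body>hello</body>\n', '</html>\n']

printed = []  # records what the loop body handled, for illustration only

for line in sample_lines:
    # Indented: these statements run once for every line.
    printed.append(line.strip())
    print(line.strip())

# Not indented: this runs once, after the loop has finished.
print('%d lines read' % len(printed))
```

Removing the indentation from a statement moves it out of the loop, so it would run only once instead of once per line.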

    5

    Select File - Save and give the file any name you like, ending in .py so that it is recognized as a Python script.

    6

    Press F5 to run your code.

    To stop the program while it is running, switch to the first window that opened (the Python Shell) and select Shell - Restart Shell from the menu.

    7

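    You can save all those lines to a file by changing the loop in your code to look like this:

    # Open test.html for writing; an existing file with that name is overwritten.
    myFile = open('test.html','w')
    for lines in filehandle.readlines():
        print lines
        # Write the same line to the file.
        myFile.write(lines)

    myFile.close()
    filehandle.close()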
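
The saving step can also be wrapped in a small helper and tried offline. This is a sketch under assumptions, not part of the original script: save_lines, demo_lines and demo_path are hypothetical names, and the file is written to a temporary folder rather than next to your script. The try/finally makes sure the file is closed even if a write fails.

```python
import os
import tempfile

def save_lines(lines, path):
    """Write each string in lines to the file at path, unchanged."""
    out = open(path, 'w')
    try:
        for line in lines:
            out.write(line)
    finally:
        # Runs whether or not the loop raised an error.
        out.close()

# Hypothetical sample data standing in for the lines of a downloaded page.
demo_lines = ['<html>\n', '<body>hello</body>\n', '</html>\n']
demo_path = os.path.join(tempfile.mkdtemp(), 'test.html')
save_lines(demo_lines, demo_path)

# Read the file back to confirm what was written.
check = open(demo_path)
saved_text = check.read()
check.close()
print(saved_text)
```

In the script from the steps above, the same helper could be called as save_lines(filehandle.readlines(), 'test.html').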

    8
    Fun with Python

    Maybe you have some stock quotes that look like this:
    AAAC,D,20071210,8.2,8.2,8.2,9.5,1000

    and you want only the close price, 9.5. You can split each line at the commas and pick out the seventh piece (index 6, because counting starts at 0) like this:

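    myFile = open('test.html','w')
    for lines in filehandle.readlines():
        # Split the line at each comma; the close price is the piece at index 6.
        sections = lines.split(',')
        print str(sections[6].strip())
        # Add a newline so each price is written on its own line.
        myFile.write(str(sections[6].strip()) + '\n')

    myFile.close()
    filehandle.close()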

    Note: The pieces returned by split() are already strings, so str() is harmless here but not required. The .strip() removes extra whitespace, such as blanks or a newline, from both ends of the piece.
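
If you want to do arithmetic with the price, the piece has to be converted to a number. A minimal sketch, assuming every line has the same layout as the sample quote above, with the close price in the seventh field; close_price is a made-up helper name:

```python
def close_price(line):
    """Return the seventh comma-separated field of line as a float."""
    fields = line.split(',')
    # strip() removes surrounding whitespace before the conversion.
    return float(fields[6].strip())

# The sample quote from the step above, with a trailing newline as read from a file.
quote = 'AAAC,D,20071210,8.2,8.2,8.2,9.5,1000\n'
price = close_price(quote)
print(price)
```

A line with fewer than seven fields raises an IndexError, and a field that is not a number raises a ValueError, so real data may need a check before the conversion.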