drdyer9051
Posts: 3
Joined: Thu Aug 16, 2018 5:55 pm

noobie needs help getting data from XML webpage

Thu Aug 16, 2018 6:22 pm

Hello, and thanks in advance for the help.

I am trying to get the data from one field into a variable.
This is the XML data after being decoded as UTF-8:
<?xml version="1.0" encoding="UTF-8"?>
<response xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XML-Schema-instance" version="1.2" xsi:noNamespaceSchemaLocation="http://aviationweather.gov/adds/schema/metar1_2.xsd">
<request_index>48683285</request_index>
<data_source name="metars" />
<request type="retrieve" />
<errors />
<warnings />
<time_taken_ms>8</time_taken_ms>
<data num_results="1">
<METAR>
<raw_text>KGKT 161655Z AUTO VRB03KT 10SM CLR 30/21 A3017 RMK A01</raw_text>
<station_id>KGKT</station_id>
<observation_time>2018-08-16T16:55:00Z</observation_time>
<latitude>35.85</latitude>
<longitude>-83.53</longitude>
<temp_c>30.0</temp_c>
<dewpoint_c>21.0</dewpoint_c>
<wind_dir_degrees>0</wind_dir_degrees>
<wind_speed_kt>3</wind_speed_kt>
<visibility_statute_mi>10.0</visibility_statute_mi>
<altim_in_hg>30.171259</altim_in_hg>
<quality_control_flags>
<auto>TRUE</auto>
<auto_station>TRUE</auto_station>
</quality_control_flags>
<sky_condition sky_cover="CLR" />
<flight_category>VFR</flight_category>
<metar_type>METAR</metar_type>
<elevation_m>309.0</elevation_m>
</METAR>
</data>
</response>


The data that I need is between <flight_category> and </flight_category>, sixth line from the bottom.
I cannot find an example despite many searches. Do I need to parse the XML, or search the string?
My code is:

Code: Select all

import urllib3

http = urllib3.PoolManager()
response = http.request('GET','https://aviationweather.gov/adds/dataserver_current/httpparam?dataSource=metars&requestType=retrieve&format=xml&hoursBeforeNow=3&mostRecent=true&stationString=PHNL%20KGKT')
data = response.data.decode('utf-8')
print(data)

DougieLawson
Posts: 33620
Joined: Sun Jun 16, 2013 11:19 pm
Location: Basingstoke, UK

Re: noobie needs help getting data from XML webpage

Thu Aug 16, 2018 6:56 pm

If you need to parse HTML or XML, BeautifulSoup4 is a good option (for well-formed XML, the standard library's xml.etree.ElementTree also works).
Microprocessor, Raspberry Pi & Arduino Hacker
Mainframe database troubleshooter
MQTT Evangelist
Twitter: @DougieLawson

2012-18: 1B*5, 2B*2, B+, A+, Z, ZW, 3Bs*3, 3B+

Any DMs sent on Twitter will be answered next month.

drdyer9051
Posts: 3
Joined: Thu Aug 16, 2018 5:55 pm

Re: noobie needs help getting data from XML webpage

Thu Aug 16, 2018 7:18 pm

I installed it with pip but could not get it to work. Python reports:
No module named 'bs4'

Code: Select all

  sudo apt-get install python-bs4

Edit: it works with python (Python 2) but not with python3.

rpdom
Posts: 12747
Joined: Sun May 06, 2012 5:17 am
Location: Ankh-Morpork

Re: noobie needs help getting data from XML webpage

Thu Aug 16, 2018 7:54 pm

You need to install the Python 3 version of the package:

Code: Select all

sudo apt-get install python3-bs4
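
If it is unclear which interpreter can actually see the module, a quick check along these lines may help. It is only a sketch using the standard library; `has_module` is a hypothetical helper name, not something from bs4.

```python
import importlib.util
import sys


def has_module(name):
    """Return True if this interpreter can find the named top-level module."""
    return importlib.util.find_spec(name) is not None


# Report which Python is running and whether it can import bs4.
print("Python", sys.version_info.major, "- bs4 available:", has_module("bs4"))
```

Running the same file once with python and once with python3 shows which of the two has bs4 installed.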

DougieLawson
Posts: 33620
Joined: Sun Jun 16, 2013 11:19 pm
Location: Basingstoke, UK

Re: noobie needs help getting data from XML webpage

Sun Aug 19, 2018 12:02 am

I hacked this Python 3 program together to parse some of the XML from the Aviation Weather Service. It reads a sample XML file I downloaded for EGLL (London Heathrow).

Code: Select all

#!/usr/bin/python3
# The 'xml' parser needs lxml as well (python3-lxml).
from bs4 import BeautifulSoup

with open('egll.xml', 'r') as infile: # CHANGE ME to fetch the current URL
    soup = BeautifulSoup(infile.read(), 'xml')

# Tags a <METAR> may contain, kept for reference; several (e.g. wind_gust_kt) are optional.
tags = [ 'raw_text', 'station_id', 'observation_time', 'latitude', 'longitude',
        'temp_c', 'dewpoint_c', 'wind_dir_degrees', 'wind_speed_kt', 'wind_gust_kt',
        'visibility_statute_mi', 'altim_in_hg', 'sea_level_pressure_mb', 'quality_control_flags',
        'wx_string', 'sky_condition', 'flight_category', 'three_hr_pressure_tendency_mb',
        'maxT_c', 'minT_c', 'maxT24hr_c', 'minT24hr_c', 'precip_in', 'pcp3hr_in',
        'pcp6hr_in', 'pcp24hr_in', 'snow_in', 'vert_vis_ft', 'metar_type', 'elevation_m']

def field(metar, tag):
    # Text of the first <tag> inside this METAR, or '' if the tag is absent
    node = metar.find(tag)
    return node.get_text() if node is not None else ''

# Handle one <METAR> at a time, so an optional tag missing from one report
# cannot shift the values of the reports that follow it.
for metar in soup.find_all('METAR'):
    #print(field(metar, 'raw_text'))
    print("----------------------------")
    print("Station: "+field(metar, 'station_id'), "Time: "+field(metar, 'observation_time'))
    print("Lat: "+field(metar, 'latitude')+"°", "Long: "+field(metar, 'longitude')+"°", "Elevation: "+field(metar, 'elevation_m')+"m", "Barometer: "+field(metar, 'altim_in_hg')+"inHg")
    print("Temp: "+field(metar, 'temp_c')+"°C", "Dewpoint: "+field(metar, 'dewpoint_c')+"°C")
    print("Wind direction: "+field(metar, 'wind_dir_degrees')+"°","Speed: "+field(metar, 'wind_speed_kt')+"Kt")
    gust = field(metar, 'wind_gust_kt')
    if gust:
        print("Wind gusting: "+gust+"Kt")
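
For the original question (just the flight category), a smaller sketch may be enough. This one swaps in the standard library's xml.etree.ElementTree instead of BeautifulSoup, so nothing needs installing. The sample string is a cut-down copy of the XML from the first post, and `flight_categories` is a made-up helper name; I am assuming the live response keeps the same response/data/METAR layout.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the response shown in the first post.
SAMPLE = """<?xml version="1.0" encoding="UTF-8"?>
<response version="1.2">
<data num_results="1">
<METAR>
<station_id>KGKT</station_id>
<flight_category>VFR</flight_category>
</METAR>
</data>
</response>"""


def flight_categories(xml_data):
    """Return {station_id: flight_category} for every METAR in xml_data."""
    # Accept either the decoded string or the raw bytes from the HTTP response.
    if isinstance(xml_data, str):
        xml_data = xml_data.encode("utf-8")
    root = ET.fromstring(xml_data)
    result = {}
    for metar in root.iter("METAR"):
        # findtext() returns None when the tag is missing from this METAR.
        result[metar.findtext("station_id")] = metar.findtext("flight_category")
    return result


print(flight_categories(SAMPLE))  # {'KGKT': 'VFR'}
```

With the urllib3 code from the first post, passing response.data (or the decoded data string) to this function should give one entry per station.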

drdyer9051
Posts: 3
Joined: Thu Aug 16, 2018 5:55 pm

Re: noobie needs help getting data from XML webpage

Mon Aug 20, 2018 12:47 pm

thank you so much for taking the time to help me.
it is greatly appreciated.

hippy
Posts: 3594
Joined: Fri Sep 09, 2011 10:34 pm
Location: UK

Re: noobie needs help getting data from XML webpage

Mon Aug 20, 2018 2:38 pm

If you only want to extract one field and that field only appears once in the file you could do -

Code: Select all

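def ExtractField(s,fieldName):
  # Find the opening tag, e.g. <flight_category>
  n = s.find("<"+fieldName+">")
  if n >= 0:
    # Skip past the opening tag (name plus the two angle brackets)
    s = s[n+len(fieldName)+2:]
    # Everything up to the closing tag is the field's value
    n = s.find("</"+fieldName+">")
    if n >= 0:
      return s[:n]
  # Field not found
  return ""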

data = ... contents of file ...
flightCategory = ExtractField(data,"flight_category")
print( flightCategory )
That is the brute-force, no-additional-imports method. You can extract other fields from the same data if you wish:

Code: Select all

flightCategory = ExtractField(data,"flight_category")
metarType = ExtractField(data, "metar_type")
print( flightCategory, metarType )
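
One thing to watch: the URL in the first post asks for two stations (PHNL and KGKT), so a field can appear more than once, and the function above returns only the first match. A possible sketch of the same no-imports idea that collects every occurrence follows. `extract_all_fields` is a name made up here, and the sample string and its values are invented for illustration, not real observations.

```python
def extract_all_fields(s, field_name):
    """Return the text of every <field_name>...</field_name> pair in s."""
    open_tag = "<" + field_name + ">"
    close_tag = "</" + field_name + ">"
    found = []
    pos = 0
    while True:
        start = s.find(open_tag, pos)
        if start < 0:
            break  # no more opening tags
        start += len(open_tag)
        end = s.find(close_tag, start)
        if end < 0:
            break  # opening tag without a closing tag; stop here
        found.append(s[start:end])
        pos = end + len(close_tag)  # continue searching after this pair
    return found


# Invented two-station sample, only to show the shape of the result.
sample = ("<METAR><station_id>PHNL</station_id><flight_category>VFR</flight_category></METAR>"
          "<METAR><station_id>KGKT</station_id><flight_category>MVFR</flight_category></METAR>")
print(extract_all_fields(sample, "flight_category"))  # ['VFR', 'MVFR']
```

The results come back in file order, so they line up with the station_id list extracted the same way, provided every METAR contains both tags.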
