Question

Parsing user agent into device_type, manufacturer, browser

  • 6 August 2015
  • 5 replies
  • 4829 views

I’m currently using a fairly rudimentary case statement to parse useragent strings into ‘device_type’ (see below).


I was wondering if anyone has done something similar for device manufacturer and/or browser they’d be willing to share.


There are certainly lots of nuances & limitations to accurate useragent detection which will impact the accuracy of data derived from these strings, but it seems the best short-term bath to get a view into what devices and browsers customers are visiting us from.


If you have alternate approaches, I’m all ears as well!


Thanks!


CASE
WHEN ${landing_page_url} LIKE 'file:///private/var/mobile/%' OR ((${useragent} LIKE '%iphone%' OR ${useragent} LIKE '%iPhone%' OR ${useragent} LIKE '%Windows mobile%' OR ${useragent} LIKE '%Windows phone%' OR ${useragent} LIKE '%Windows Phone%' OR ${useragent} LIKE '%Nexus 5%' OR ${useragent} LIKE '%GTI-9300%' OR ${useragent} LIKE '%Nokia%' OR ${useragent} LIKE '%SGH-M919V%' OR ${useragent} LIKE '%SCH-%' OR ${useragent} LIKE '%Mobile%' OR ${useragent} LIKE '%Opera mini%') AND (${useragent} NOT LIKE '%iPad%')) THEN 'mobile'
WHEN ((${useragent} LIKE '%Windows%' OR ${useragent} LIKE '%WOW64%' OR ${useragent} LIKE '%Intel Mac OS%' OR ${useragent} LIKE '%Windows NT 6.1; Trident/7.0%' OR ${useragent} LIKE '%Media Center PC%') AND (${useragent} NOT LIKE '%iPad%')) THEN 'desktop'
WHEN (${useragent} LIKE '%Tablet PC%' OR ${useragent} LIKE '%Touch%' OR ${useragent} LIKE '%MyPhone%' OR ${useragent} LIKE '%iPad%' OR ${useragent} LIKE '%ipad%' OR ${useragent} LIKE '%Tablet%') THEN 'tablet'
WHEN (${useragent} LIKE '%Baiduspider%') ELSE 'unknown'
END

5 replies

Thanks Colin 😄

Userlevel 4

Edited, thanks.

@colin, edit to the Chrome logic:


WHEN ${user_agent} LIKE '%Chrome/%' OR ${user_agent} LIKE '%CriOS%' THEN 'Chrome'


per https://developer.chrome.com/multidevice/user-agent

Colin - this is fantastic; super helpful! Many thanks!

Userlevel 4

I wrote some of this myself for our internal usage. Here is my code (warning, its gnarly). I’d add, I’m not an expert here, so this was based on a couple hours of research and there may be corner cases I’m missing. I’d also add that the code below is for Redshift, so may need mild tuning for other dialects.


  - dimension: user_agent
sql: ${TABLE}.user_agent

- dimension: browser
sql: |
CASE
WHEN ${user_agent} LIKE '%Firefox/%' THEN 'Firefox'
WHEN ${user_agent} LIKE '%Chrome/%' OR ${user_agent} LIKE '%CriOS%' THEN 'Chrome'
WHEN ${user_agent} LIKE '%MSIE %' THEN 'IE'
WHEN ${user_agent} LIKE '%MSIE+%' THEN 'IE'
WHEN ${user_agent} LIKE '%Trident%' THEN 'IE'
WHEN ${user_agent} LIKE '%iPhone%' THEN 'iPhone Safari'
WHEN ${user_agent} LIKE '%iPad%' THEN 'iPad Safari'
WHEN ${user_agent} LIKE '%Opera%' THEN 'Opera'
WHEN ${user_agent} LIKE '%BlackBerry%' AND ${user_agent} LIKE '%Version/%' THEN 'BlackBerry WebKit'
WHEN ${user_agent} LIKE '%BlackBerry%' THEN 'BlackBerry'
WHEN ${user_agent} LIKE '%Android%' THEN 'Android'
WHEN ${user_agent} LIKE '%Safari%' THEN 'Safari'
WHEN ${user_agent} LIKE '%bot%' THEN 'Bot'
WHEN ${user_agent} LIKE '%http://%' THEN 'Bot'
WHEN ${user_agent} LIKE '%www.%' THEN 'Bot'
WHEN ${user_agent} LIKE '%Wget%' THEN 'Bot'
WHEN ${user_agent} LIKE '%curl%' THEN 'Bot'
WHEN ${user_agent} LIKE '%urllib%' THEN 'Bot'
ELSE 'Unknown'
END

- dimension: browser_version
sql: |
CASE
WHEN ${browser} = 'Firefox'
THEN SUBSTRING(${user_agent}, POSITION('Firefox' IN ${user_agent}) + 8, 100)
WHEN ${browser} = 'Safari'
THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)
WHEN ${browser} = 'Chrome'
THEN LEFT(
SUBSTRING(${user_agent}
, POSITION('Chrome' IN ${user_agent}) + 7
, 100)
, POSITION(' ' IN SUBSTRING(${user_agent}
, POSITION('Chrome' IN ${user_agent}) + 7
, 100)
)
)
WHEN ${user_agent} LIKE '%Trident%'
THEN '11.0'
WHEN ${browser} = 'IE'
THEN SUBSTRING(${user_agent}, POSITION('MSIE' IN ${user_agent}) + 5, 4)
WHEN ${browser} = 'iPhone Safari'
THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)
WHEN ${browser} = 'iPad Safari'
THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)
ELSE 'Unknown'
END

- dimension: platform_start
type: number
hidden: true
sql: |
POSITION('(' IN ${user_agent}) + 1

- dimension: platform_raw
hidden: true
sql: |
SUBSTRING(${user_agent}, ${platform_start}, 100)

- dimension: platform_end
type: number
hidden: true
sql: |
CASE
WHEN POSITION(';' IN ${platform_raw}) = 0
THEN POSITION(')' IN ${platform_raw})
ELSE POSITION(';' IN ${platform_raw})
END

- dimension: platform_end_2
type: number
hidden: true
sql: |
CASE WHEN ${platform_end} = 0 THEN 0 ELSE ${platform_end} - 1 END


- dimension: platform
sql: |
SUBSTRING(${user_agent}, ${platform_start}, ${platform_end_2})

Reply