Question

Parsing user agent into device_type, manufacturer, browser

  • 6 August 2015
  • 5 replies
  • 5105 views

I’m currently using a fairly rudimentary case statement to parse useragent strings into ‘device_type’ (see below).



I was wondering if anyone has done something similar for device manufacturer and/or browser they’d be willing to share.



There are certainly lots of nuances & limitations to accurate useragent detection which will impact the accuracy of data derived from these strings, but it seems the best short-term bath to get a view into what devices and browsers customers are visiting us from.



If you have alternate approaches, I’m all ears as well!



Thanks!



CASE

WHEN ${landing_page_url} LIKE 'file:///private/var/mobile/%' OR ((${useragent} LIKE '%iphone%' OR ${useragent} LIKE '%iPhone%' OR ${useragent} LIKE '%Windows mobile%' OR ${useragent} LIKE '%Windows phone%' OR ${useragent} LIKE '%Windows Phone%' OR ${useragent} LIKE '%Nexus 5%' OR ${useragent} LIKE '%GTI-9300%' OR ${useragent} LIKE '%Nokia%' OR ${useragent} LIKE '%SGH-M919V%' OR ${useragent} LIKE '%SCH-%' OR ${useragent} LIKE '%Mobile%' OR ${useragent} LIKE '%Opera mini%') AND (${useragent} NOT LIKE '%iPad%')) THEN 'mobile'

WHEN ((${useragent} LIKE '%Windows%' OR ${useragent} LIKE '%WOW64%' OR ${useragent} LIKE '%Intel Mac OS%' OR ${useragent} LIKE '%Windows NT 6.1; Trident/7.0%' OR ${useragent} LIKE '%Media Center PC%') AND (${useragent} NOT LIKE '%iPad%')) THEN 'desktop'

WHEN (${useragent} LIKE '%Tablet PC%' OR ${useragent} LIKE '%Touch%' OR ${useragent} LIKE '%MyPhone%' OR ${useragent} LIKE '%iPad%' OR ${useragent} LIKE '%ipad%' OR ${useragent} LIKE '%Tablet%') THEN 'tablet'

WHEN (${useragent} LIKE '%Baiduspider%') ELSE 'unknown'

END

5 replies

Userlevel 4

I wrote some of this myself for our internal usage. Here is my code (warning, its gnarly). I’d add, I’m not an expert here, so this was based on a couple hours of research and there may be corner cases I’m missing. I’d also add that the code below is for Redshift, so may need mild tuning for other dialects.



  - dimension: user_agent

sql: ${TABLE}.user_agent



- dimension: browser

sql: |

CASE

WHEN ${user_agent} LIKE '%Firefox/%' THEN 'Firefox'

WHEN ${user_agent} LIKE '%Chrome/%' OR ${user_agent} LIKE '%CriOS%' THEN 'Chrome'

WHEN ${user_agent} LIKE '%MSIE %' THEN 'IE'

WHEN ${user_agent} LIKE '%MSIE+%' THEN 'IE'

WHEN ${user_agent} LIKE '%Trident%' THEN 'IE'

WHEN ${user_agent} LIKE '%iPhone%' THEN 'iPhone Safari'

WHEN ${user_agent} LIKE '%iPad%' THEN 'iPad Safari'

WHEN ${user_agent} LIKE '%Opera%' THEN 'Opera'

WHEN ${user_agent} LIKE '%BlackBerry%' AND ${user_agent} LIKE '%Version/%' THEN 'BlackBerry WebKit'

WHEN ${user_agent} LIKE '%BlackBerry%' THEN 'BlackBerry'

WHEN ${user_agent} LIKE '%Android%' THEN 'Android'

WHEN ${user_agent} LIKE '%Safari%' THEN 'Safari'

WHEN ${user_agent} LIKE '%bot%' THEN 'Bot'

WHEN ${user_agent} LIKE '%http://%' THEN 'Bot'

WHEN ${user_agent} LIKE '%www.%' THEN 'Bot'

WHEN ${user_agent} LIKE '%Wget%' THEN 'Bot'

WHEN ${user_agent} LIKE '%curl%' THEN 'Bot'

WHEN ${user_agent} LIKE '%urllib%' THEN 'Bot'

ELSE 'Unknown'

END



- dimension: browser_version

sql: |

CASE

WHEN ${browser} = 'Firefox'

THEN SUBSTRING(${user_agent}, POSITION('Firefox' IN ${user_agent}) + 8, 100)

WHEN ${browser} = 'Safari'

THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)

WHEN ${browser} = 'Chrome'

THEN LEFT(

SUBSTRING(${user_agent}

, POSITION('Chrome' IN ${user_agent}) + 7

, 100)

, POSITION(' ' IN SUBSTRING(${user_agent}

, POSITION('Chrome' IN ${user_agent}) + 7

, 100)

)

)

WHEN ${user_agent} LIKE '%Trident%'

THEN '11.0'

WHEN ${browser} = 'IE'

THEN SUBSTRING(${user_agent}, POSITION('MSIE' IN ${user_agent}) + 5, 4)

WHEN ${browser} = 'iPhone Safari'

THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)

WHEN ${browser} = 'iPad Safari'

THEN SUBSTRING(${user_agent}, POSITION('Safari' IN ${user_agent}) + 7, 100)

ELSE 'Unknown'

END



- dimension: platform_start

type: number

hidden: true

sql: |

POSITION('(' IN ${user_agent}) + 1



- dimension: platform_raw

hidden: true

sql: |

SUBSTRING(${user_agent}, ${platform_start}, 100)



- dimension: platform_end

type: number

hidden: true

sql: |

CASE

WHEN POSITION(';' IN ${platform_raw}) = 0

THEN POSITION(')' IN ${platform_raw})

ELSE POSITION(';' IN ${platform_raw})

END



- dimension: platform_end_2

type: number

hidden: true

sql: |

CASE WHEN ${platform_end} = 0 THEN 0 ELSE ${platform_end} - 1 END





- dimension: platform

sql: |

SUBSTRING(${user_agent}, ${platform_start}, ${platform_end_2})

Colin - this is fantastic; super helpful! Many thanks!

@colin, edit to the Chrome logic:



WHEN ${user_agent} LIKE '%Chrome/%' OR ${user_agent} LIKE '%CriOS%' THEN 'Chrome'



per https://developer.chrome.com/multidevice/user-agent

Userlevel 4

Edited, thanks.

Thanks Colin 😄

Reply