Henry - A Command Line Tool for Looker Instance Cleanup

jax1 · 10-12-2018 08:00 AM

The Problem

As a Looker model grows in size and sophistication, the number of Explores, Views and Fields grows with it. A common side effect is model bloat, which typically degrades the end-user experience.

The Why

Henry is a command line tool that helps detect model bloat in your Looker instance and identify unused content in models and explores. It provides recommendations that developers can validate in order to clean up models by removing unused explores, clean up explores by removing unused joins and fields, and maintain a healthy and user-friendly instance.

The How

The tool currently has three main commands: pulse, analyze and vacuum.

The pulse command runs a number of tests that help determine the overall health of the Looker instance. The tests include: connection checks, which confirm that all connections are in working order; query history checks, which flag queries whose runtime stands out; checks for the use of any legacy features; schedule plan health; and finally, whether the latest version of Looker is being used.

The analyze command scans projects, models and explores. For projects, it scans their content and checks the status of features that are essential for success, such as the git connection and validation requirements. For models and explores, it provides statistics on unused explores, joins and fields, as well as query counts.

Finally, the vacuum command can be used with models and explores; it outputs a list of unused content based on predefined criteria. For example, to find out which fields are unused in the cohorts explore of the thelook model, run:

$ henry vacuum explores --model thelook --explore cohorts
which yields:

| model   | explore   | unused_joins   | unused_fields                |
|---------+-----------+----------------+------------------------------|
| thelook | cohorts   | N/A            | order_items.created_date     |
|         |           |                | order_items.id               |
|         |           |                | order_items.total_sale_price |
+---------+-----------+----------------+------------------------------+
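If you want to feed this output into a script, the table can be parsed with a few lines of Python. The following is only a minimal sketch: it assumes the output keeps the layout shown above (pipe-separated cells with an unused_fields column) and that field names never contain a pipe character. The function name parse_unused_fields is made up for this illustration and is not part of Henry.

```python
def parse_unused_fields(table_text):
    """Collect the values of the unused_fields column from the table text."""
    header = None
    fields = []
    for line in table_text.splitlines():
        line = line.strip()
        # Skip blank lines and separator rows made only of |, - and +.
        if not line.startswith("|") or set(line) <= set("|-+"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            # The first real row names the columns.
            header = cells
            continue
        value = cells[header.index("unused_fields")]
        if value and value != "N/A":
            fields.append(value)
    return fields


# The sample output from the post above.
sample = """\
| model   | explore   | unused_joins   | unused_fields                |
|---------+-----------+----------------+------------------------------|
| thelook | cohorts   | N/A            | order_items.created_date     |
|         |           |                | order_items.id               |
|         |           |                | order_items.total_sale_price |
+---------+-----------+----------------+------------------------------+
"""

print(parse_unused_fields(sample))
```

The resulting list is still only a set of candidates; each field should be validated before it is removed from the LookML.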

The Setup

Henry is on PyPI and the easiest way to install it is by running
pip install henry.

If you are interested in the implementation or in contributing, the source code can be found in the GitHub repository: https://github.com/looker-open-source/henry.

Please note that this tool is open source and is not supported by Looker’s normal support channels. However, you are encouraged to file any issues you encounter while using the tool here: https://github.com/looker-open-source/henry/issues, as this will help contributors enhance it further.

IanT

Hi @jax1, we have only just got around to looking at this tool after speaking with @Katie_Hindson at a London Looker meetup. We are both trying to automate the cleanup process. I have documented how automated cleanup would work for content, users, spaces, connections and explores, but I was wondering if you could see a way to hide/remove models, projects, PDTs and datagroups.
Also, what do you see as the future/next steps of this tool? Would automating the manual steps that follow an analysis with Henry fit with your vision of this tool?

Thanks!

nicholaswongsg

Thanks Joseph! Henry will be a great tool for developers to clean up their code. I have faced challenges deciding which blocks of code are used and which can be deleted. I’m really excited to try this out! 😃

jax1

Hey @IanT! Unfortunately I missed that event but I will do my best to be at the next one.

I did some research to answer your questions; however, I ended up with more questions myself due to certain behaviour I encountered. I will reach out to you directly via email.

With regards to future/next steps of this tool, having the option to automatically remove/hide fields is not on the radar since the results do require some analysis by the user before deciding whether to remove a field or not. Some features that I am currently working on include being able to count indirect usage of fields, some more actionable content stats and adding the ability to generate some form of report from the tool.

jax1

Happy to hear that, Nicholas. I’m interested to know how you get on with this. Please feel free to leave feedback here or in the repo, or send it directly to me once you’ve had a chance to try this out.

IanT

When I said “automatically take actions”, I meant that actions could be scripted rather than done by the user in the UI.
As I said previously, some of the actions the user would want to complete are possible using the API; the rest I don’t think are 😢

izzymiller

A post was split to a new topic: Issues running Henry with self-hosted Looker

Dawid

@jax1 any news on trying to run Henry on Windows?

izzymiller

I’ll wait for Jax to speak, but I think he bears good news!

Hassan_Al_Rabea

I’m running on a Windows machine and I wanted to test out Henry. After running pip install and “henry --help”, I hit the following error:

Traceback (most recent call last):
  File "c:\users\al\.conda\envs\test\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\users\al\.conda\envs\test\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\Users\al\.conda\envs\test\Scripts\henry.exe\__main__.py", line 4, in <module>
  File "c:\users\al\.conda\envs\test\lib\site-packages\henry\cli.py", line 33, in <module>
    logging.config.fileConfig(LOGGING_CONFIG_PATH,
  File "c:\users\al\.conda\envs\test\lib\logging\config.py", line 79, in fileConfig
    handlers = _install_handlers(cp, formatters)
  File "c:\users\al\.conda\envs\test\lib\logging\config.py", line 142, in _install_handlers
    args = eval(args, vars(logging))
  File "<string>", line 1
SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \UXXXXXXXX escape

Is this a known issue or an issue on my end? Thanks for the help!
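One plausible reading of this traceback: the failing call is eval(args, vars(logging)), which evaluates a handler's args string from the logging config as Python source. If that string contains a Windows path such as C:\Users\..., the \U at positions 2-3 is read as the start of a \UXXXXXXXX escape, which would produce exactly this message. The sketch below reproduces the error outside Henry. The path and file name henry.log are hypothetical, and whether this is the actual cause inside Henry is an assumption, not something the thread confirms.

```python
def try_eval(source):
    """Evaluate an args string the way logging.config does; report failures."""
    try:
        return eval(source)
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg}"


# Hypothetical args value: the evaluated text holds single backslashes,
# so the string literal inside it starts with C:\U...
broken = "('C:\\Users\\al\\henry.log',)"

# Two possible repairs: forward slashes, or a raw string literal.
forward = "('C:/Users/al/henry.log',)"
raw = "(r'C:\\Users\\al\\henry.log',)"

print(try_eval(broken))   # the unicodeescape SyntaxError text
print(try_eval(forward))  # a one-element tuple holding the path
print(try_eval(raw))      # the same path, with backslashes preserved
```

If this is indeed the cause, a path written with forward slashes, or escaped before it is substituted into the config, would avoid the error on Windows.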

André1

I’m having the same problem over here

Hassan_Al_Rabea

Hey Andre!
So it seems that Henry doesn’t play well with newer versions of Python. I was able to resolve the issue by installing Henry in a virtual environment with Python 3.7.x.
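A quick way to see whether your interpreter is newer than the version reported to work here is a small check like the one below. It is only a sketch: the (3, 7) threshold comes from this thread alone, not from Henry's documentation, and the function name is made up for illustration.

```python
import sys

# Assumption taken from this thread only: Python 3.7.x was reported to work.
LAST_REPORTED_WORKING = (3, 7)


def newer_than_reported(version_info=sys.version_info):
    """Return True if the major.minor version is above the reported one."""
    return tuple(version_info[:2]) > LAST_REPORTED_WORKING


if newer_than_reported():
    print("This interpreter is newer than 3.7; a 3.7 virtual environment may help.")
else:
    print("This interpreter is 3.7 or older.")
```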

André1

Hey Hassan,

Thank you for the tip! Now it’s working like a charm.

paulopinheiroco

Hi everyone, after running pip install on a Windows machine in an environment with Python 3.7.11 and trying to run “henry pulse”, I get the following error (the credentials and URL in the .ini file are correct):

Traceback (most recent call last):
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\paulo\.conda\envs\looker_henry\Scripts\henry.exe\__main__.py", line 7, in <module>
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\site-packages\henry\cli.py", line 20, in main
    vacuum.Vacuum.run(user_input)
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\site-packages\henry\commands\vacuum.py", line 14, in run
    result = vacuum.explores(model=user_input.model, explore=user_input.explore)
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\contextlib.py", line 74, in inner
    return func(*args, **kwds)
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\site-packages\henry\commands\vacuum.py", line 46, in explores
    field_stats = self.get_explore_field_stats(e)
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\site-packages\henry\modules\fetcher.py", line 279, in get_explore_field_stats
    model=explore.model_name, explore=explore.name
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\site-packages\henry\modules\fetcher.py", line 235, in get_used_explore_fields
    limit="5000",
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\site-packages\looker_sdk\sdk\methods.py", line 3837, in run_inline_query
    transport_options=transport_options,
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\site-packages\looker_sdk\rtl\api_methods.py", line 166, in post
    return self._return(response, structure)
  File "C:\Users\paulo\.conda\envs\looker_henry\lib\site-packages\looker_sdk\rtl\api_methods.py", line 78, in _return
    raise error.SDKError(response.value.decode(encoding=encoding))
looker_sdk.error.SDKError: HTTPSConnectionPool(host='instance_url_not_shown_for_security', port=19999): Read timed out. (read timeout=120)

I would be happy if anyone could help here.
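The last line of the traceback shows the inline query hitting a 120-second read timeout, so one thing that may be worth trying is a longer timeout in the .ini file. The sketch below adds a timeout key to a config section using only the standard library. It rests on two assumptions: that the Looker SDK reads a timeout key (in seconds) from the .ini file, and that Henry passes that setting through. The section name Looker, the URL and the credential values are placeholders, not taken from the post.

```python
import configparser
import io


def with_timeout(ini_text, section, seconds):
    """Return the ini text with timeout set (in seconds) for one section."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    parser[section]["timeout"] = str(seconds)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Placeholder config; replace the values with your own.
sample_ini = """\
[Looker]
base_url = https://example.looker.com:19999
client_id = your_client_id
client_secret = your_client_secret
"""

updated = with_timeout(sample_ini, "Looker", 300)
print(updated)
```

A longer timeout only helps if the query eventually finishes; if the underlying query is simply too slow, the error will come back at the new limit.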

Thanks!
