* Split context id migration into states and events tasks
Since events can finish much earlier than states, we
would keep looking at the table because states are not
done. Make them separate tasks
* add retry dec
* fix migration happening twice
* another case
* Deduplicate event_types in the events table
* more fixes
* adjust
* fix product
* fix tests
* adjust
* migrate
* more test fixes
* fix
* migration test
* adjust
* speed up
* fix index
* fix more tests
* handle db failure
* preload
* tweak
* adjust
* fix stale docstrings, remove dead code
* refactor
* fix slow tests
* coverage
* self join to resolve query performance
* fix typo
* no need for quiet
* no need to drop index already dropped
* remove index that will never be used
* drop index sooner as we no longer use it
* Revert "remove index that will never be used"
This reverts commit 461aad2c52.
* typo
* Load pending state attributes and event data ids at startup
Since we queue all events to be processed after startup
we can have a thundering herd of queries to prime the
LRUs of event data and state attributes ids. Since we
know we are about to process a chunk of events we can
fetch all the ids in two queries
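
A minimal sketch of the batching idea, not the recorder's actual code: instead of one lookup per queued event, gather the pending hashes and resolve them with one query per table. The table layout, the `fetch_ids` helper, and the use of an in-memory SQLite database are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical, simplified tables standing in for the real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE state_attributes (attributes_id INTEGER PRIMARY KEY, hash INTEGER)"
)
conn.execute("CREATE TABLE event_data (data_id INTEGER PRIMARY KEY, hash INTEGER)")
conn.executemany("INSERT INTO state_attributes VALUES (?, ?)", [(1, 111), (2, 222)])
conn.executemany("INSERT INTO event_data VALUES (?, ?)", [(10, 900)])


def fetch_ids(conn, table, id_column, hashes):
    """Resolve many hashes to row ids with a single IN (...) query.

    `table` and `id_column` are fixed by the caller, never user input.
    """
    if not hashes:
        return {}
    ordered = sorted(hashes)
    placeholders = ",".join("?" * len(ordered))
    rows = conn.execute(
        f"SELECT hash, {id_column} FROM {table} WHERE hash IN ({placeholders})",
        ordered,
    )
    return dict(rows.fetchall())


# Hashes collected from the events queued during startup (made-up values).
pending_attr_hashes = {111, 222, 333}
pending_data_hashes = {900}

# Two queries in total, regardless of how many events are queued.
attr_ids = fetch_ids(conn, "state_attributes", "attributes_id", pending_attr_hashes)
data_ids = fetch_ids(conn, "event_data", "data_id", pending_data_hashes)
```

Hashes with no matching row (333 here) are simply absent from the result, so the caller can treat them as new rows to insert.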
* lru
* fix hang
* Fix recorder LRU being destroyed if event session is reopened
We would clear the LRU in _close_event_session but
it would never get replaced with an LRU again so
it would leak memory if the event session is reopened
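
A rough illustration of this class of bug, under the assumption that the cache was swapped for a plain container on close. The `LRU` class and `EventSession` wrapper below are hypothetical stand-ins, not the recorder's classes.

```python
from collections import OrderedDict


class LRU(OrderedDict):
    """Small bounded mapping that evicts the least recently used key."""

    def __init__(self, maxsize):
        super().__init__()
        self.maxsize = maxsize

    def __getitem__(self, key):
        value = super().__getitem__(key)
        self.move_to_end(key)
        return value

    def __setitem__(self, key, value):
        if key in self:
            self.move_to_end(key)
        super().__setitem__(key, value)
        if len(self) > self.maxsize:
            oldest = next(iter(self))
            del self[oldest]


class EventSession:
    """Hypothetical holder of the cache across session open/close."""

    def __init__(self):
        self.cache = LRU(2)

    def close_buggy(self):
        # Replaces the bounded LRU with an unbounded dict: after a reopen
        # nothing ever evicts entries, so memory grows without limit.
        self.cache = {}

    def close_fixed(self):
        # Empties the cache but keeps the bounded LRU object in place.
        self.cache.clear()


buggy, fixed = EventSession(), EventSession()
buggy.close_buggy()
fixed.close_fixed()
for i in range(5):  # simulate work after the session is reopened
    buggy.cache[i] = i
    fixed.cache[i] = i
```

After the loop the buggy cache holds all five entries, while the fixed one still holds only the two most recent.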
* cleanup
* Adjust size of recorder LRU based on number of entities
If there are a large number of entities the cache would
get thrashed as there were more state attributes being
recorded than the size of the cache. This meant we had
to go back to the database to do lookups frequently when
an instance has more than 2048 entities that change
frequently
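
To see why an undersized cache thrashes, here is a small self-contained experiment using `functools.lru_cache`. The sizes are scaled down (4 and 8 rather than 2048) and the `hit_rate` helper is purely illustrative.

```python
from functools import lru_cache


def hit_rate(cache_size, entity_count, rounds=3):
    """Cycle through every entity `rounds` times and report the cache hit rate."""

    @lru_cache(maxsize=cache_size)
    def lookup(entity_id):
        return entity_id  # stands in for a database lookup

    for _ in range(rounds):
        for entity_id in range(entity_count):
            lookup(entity_id)

    info = lookup.cache_info()
    return info.hits / (info.hits + info.misses)


# One more entity than the cache can hold: each entry is evicted just
# before it is needed again, so every lookup misses.
undersized = hit_rate(cache_size=4, entity_count=5)

# A cache at least as large as the working set only misses on the first pass.
right_sized = hit_rate(cache_size=8, entity_count=5)
```

With a cyclic access pattern the undersized cache never hits at all, which is the worst case for LRU eviction and the reason to scale the cache with the number of entities.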
* add a test
* do not actually record 4096 states
* patch target
* Chunk MariaDB data migration to avoid running out of buffer space
This will make the migration slower, but since innodb_buffer_pool_size
is using the default of 128M and is not tuned to the db size, there is a
risk of running out of buffer space for large databases
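
A sketch of the chunking pattern, assuming a simplified `states` table with hypothetical `old_value` and `new_value` columns. It uses SQLite so it can run standalone; the real migration targets MariaDB and its SQL will differ.

```python
import sqlite3

CHUNK_SIZE = 2  # tiny on purpose; a real migration would use thousands of rows

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE states ("
    "state_id INTEGER PRIMARY KEY, old_value TEXT, new_value TEXT)"
)
conn.executemany(
    "INSERT INTO states (old_value) VALUES (?)",
    [("a",), ("b",), (None,), ("d",), ("e",)],
)
conn.commit()


def migrate_in_chunks(conn, chunk_size):
    """Convert rows a chunk at a time, committing after each chunk."""
    chunks = 0
    while True:
        cursor = conn.execute(
            # COALESCE guards against a NULL source value: without it the
            # row would stay NULL and be selected again on every pass.
            "UPDATE states SET new_value = COALESCE(UPPER(old_value), '') "
            "WHERE state_id IN ("
            "  SELECT state_id FROM states WHERE new_value IS NULL LIMIT ?"
            ")",
            (chunk_size,),
        )
        conn.commit()  # keep each transaction, and its locks, small
        if cursor.rowcount == 0:
            return chunks
        chunks += 1


chunks_run = migrate_in_chunks(conn, CHUNK_SIZE)
migrated = [row[0] for row in conn.execute(
    "SELECT new_value FROM states ORDER BY state_id"
)]
```

Each pass touches at most `CHUNK_SIZE` rows, so the amount of data held in one transaction stays bounded no matter how large the table is.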
* Update homeassistant/components/recorder/migration.py
* hard code since bandit thinks it's an injection
* Update homeassistant/components/recorder/migration.py
* guard against manually modified data/corrupt db
* adjust to 10k per chunk
* adjust to 50k per chunk
* memory still just fine at 250k
* but slower
* commit after each chunk to reduce lock pressure
* adjust
* set to 0 if null so we do not loop forever (this should only happen if the data is missing)
* tweak
* limit cleanup
* lower limit to give some more buffer
* where required for sqlite
* sqlite can wipe as many as needed with no limit
* limit on mysql only
* chunk postgres
* fix limit
* tweak
* fix reference
* fix
* tweak for ram
* postgres memory reduction
* defer cleanup
* fix
* same order
If there are a lot of excluded events for the recorder, it
can have a performance impact as the list has to be searched
every time an event fires in HA
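
One common remedy for this kind of per-event list scan is a set lookup. The sketch below assumes that approach; the event type names and the `should_record` helper are made up for illustration.

```python
# Hypothetical configuration with many excluded event types.
exclude_list = [f"custom_event_{i}" for i in range(1000)]

# Built once at setup time; membership checks are then O(1) on average
# instead of scanning up to 1000 list entries for every event.
exclude_set = frozenset(exclude_list)


def should_record(event_type, excluded):
    """Return True when the event type is not excluded."""
    return event_type not in excluded


recorded = should_record("state_changed", exclude_set)
skipped = not should_record("custom_event_999", exclude_set)
```

Both containers give the same answers; only the cost per fired event changes.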
* Validate common statistics db schema errors on start
* Fix test
* Add tests
* Adjust tests
* Disable statistics schema validation in tests
* Update after rebase
* Fire events when long-term statistics are updated
* Allow the new events to be subscribed to by anyone
* Address review comments
* Finish renaming events
* Fix do_adhoc_statistics
* Adjust tests
* Initial orjson support take 2
Still need to work out problem building wheels
--
Redux of #72754 / #32153. Now possible since the following is solved:
ijl/orjson#220 (comment)
This implements orjson where we use our default encoder. This does not implement orjson where `ExtendedJSONEncoder` is used as these areas tend to be called far less frequently. If it's desired, this could be done in a follow-up, but it seemed like a case of diminishing returns (except maybe for large diagnostics files, or traces, but those are not expected to be downloaded frequently).
Areas where this makes a perceptible difference:
- Anything that subscribes to entities (Initial subscribe_entities payload)
- Initial download of registries on first connection / restore
- History queries
- Saving states to the database
- Large logbook queries
- Anything that subscribes to events (appdaemon)
Caveats:
orjson supports serializing dataclasses natively (and much faster) which
eliminates the need to implement `as_dict` in many places
when the data is already in a dataclass. This works
well as long as all the data in the dataclass can also
be serialized. I audited all places where we have an `as_dict`
for a dataclass and found that only backups needed to be adjusted (support for `Path` needed to be added for backups). I was a little bit worried about `SensorExtraStoredData` with `Decimal`, but it all seems to work out since it is converted before it gets to the JSON encoding. cc @dgomes
If it turns out to be a problem, we can disable this
with option |= [orjson.OPT_PASSTHROUGH_DATACLASS](https://github.com/ijl/orjson#opt_passthrough_dataclass) and it
will fall back to `as_dict`
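
A small sketch of the dataclass behavior described above. The `Backup` dataclass and `json_default` helper are hypothetical and do not mirror the real encoder; the block falls back to the standard `json` module when orjson is not installed so it can run anywhere.

```python
import dataclasses
import json
from pathlib import Path

try:
    import orjson
except ImportError:  # keep the sketch runnable without orjson
    orjson = None


@dataclasses.dataclass
class Backup:
    """Hypothetical dataclass with a field that JSON cannot encode natively."""

    slug: str
    path: Path


def json_default(obj):
    """Handle types the encoder does not know how to serialize."""
    if isinstance(obj, Path):
        return obj.as_posix()
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        # Only reached with the stdlib fallback (or if dataclass
        # passthrough is enabled); orjson handles dataclasses itself.
        return dataclasses.asdict(obj)
    raise TypeError(f"Unsupported type: {type(obj)!r}")


def dumps(obj) -> bytes:
    if orjson is not None:
        # The dataclass is serialized natively; `default` is consulted
        # only for the Path field.
        return orjson.dumps(obj, default=json_default)
    return json.dumps(obj, default=json_default, separators=(",", ":")).encode()


encoded = dumps(Backup(slug="abc123", path=Path("/backup/abc123.tar")))
decoded = json.loads(encoded)
```

Either path produces the same JSON object here, which is why no `as_dict` method is needed as long as every field is serializable or covered by the default hook.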
It's quite impressive for history queries
<img width="1271" alt="Screen_Shot_2022-05-30_at_23_46_30" src="https://user-images.githubusercontent.com/663432/171145699-661ad9db-d91d-4b2d-9c1a-9d7866c03a73.png">
* use for views as well
* handle UnicodeEncodeError
* tweak
* DRY
* not needed
* fix tests
* Update tests/components/http/test_view.py
* black
* templates
* Separate recorder database schema from other classes
* fix logbook imports
* migrate new tests
* few more
* last one
* fix merge
Co-authored-by: J. Nick Koston <nick@koston.org>