ankane / active_median Goto Github PK
View Code? Open in Web Editor NEWMedian and percentile for Active Record, Mongoid, arrays, and hashes
License: MIT License
Median and percentile for Active Record, Mongoid, arrays, and hashes
License: MIT License
I have a table called Order
and I'm trying to determine all 100 percentiles for Order.weight
.
Here's my code (apologies if there's a much cleaner way to do this!):
100.times do |p|
weight_percentiles << Order.percentile(:weight_lb, p/100.0)
end
When I run this code, I get:
(0.7ms) SELECT PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY "orders"."weight_lb") AS percentile_weight_lb FROM "orders"
/Users/me/.rvm/gems/ruby-3.1.2/gems/rack-mini-profiler-2.3.4/lib/patches/db/pg.rb:69:in `exec_params': PG::UndefinedFunction: ERROR: function percentile_cont(numeric, character varying) does not exist (ActiveRecord::StatementInvalid)
LINE 1: SELECT PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY "order...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
/Users/me/.rvm/gems/ruby-3.1.2/gems/rack-mini-profiler-2.3.4/lib/patches/db/pg.rb:69:in `exec_params': ERROR: function percentile_cont(numeric, character varying) does not exist (PG::UndefinedFunction)
LINE 1: SELECT PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY "order...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
Any idea how to resolve this?
Please create a new issue to discuss any ideas or share your own.
mode
approximate
branchWhen running the command ActiveMedian.create_function
I get the following error:
ActiveRecord::StatementInvalid: Mysql2::Error: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FUNCTION median(anyarray)
Using mysql2 gem. Anything I should be doing beforehand?
When using with groupdate, the median
method results in keys that are time objects, whereas with average
, the keys are date objects. This then results in the chartkick labels showing times ๐ข
query.group_by_period(:month, :submitted_at).average(:time)
=> {Sun, 01 Apr 2018=>0.128564313333333333e6, Tue, 01 May 2018=>0.180389867753623188e6, Fri, 01 Jun 2018=>0.185993985959438378e6, Sun, 01 Jul 2018=>0.168664401069518717e6, Wed, 01 Aug 2018=>0.209151385899814471e6, Sat, 01 Sep 2018=>0.140354274944567627e6}
query.group_by_period(:month, :submitted_at).median(:time)
=> {2018-04-01 00:00:00 UTC=>46613.5, 2018-05-01 00:00:00 UTC=>51045.5, 2018-06-01 00:00:00 UTC=>26603.0, 2018-07-01 00:00:00 UTC=>39185.0, 2018-08-01 00:00:00 UTC=>15291.0, 2018-09-01 00:00:00 UTC=>24202.0}
I have a has-many association with DISTINCT ON select.
has_many :active_offers, -> {
select("DISTINCT ON (region_id, brand_id) *")
.order(:region_id, :brand_id, created_at: :desc)
}, class_name: 'Offer'
And as a result, Brand.active_offers.median(:price) throws an error
PG::GroupingError: ERROR: column "offers.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT DISTINCT ON (region_id, brand_id) *, PERCENTILE_CONT(...
^
Generated query from logs:
SELECT DISTINCT ON (region_id, brand_id) *, PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY "offers"."price") AS median_price FROM "offers" WHERE "offers"."fuel_id" = 1
This is a great gem... I want to first say thank you for it. Medians are usually so much more appropriate than averages. Perhaps some day everyone else will realize that. ๐
Anyway, I get an error when running my tests:
ActiveRecord::StatementInvalid: ActiveRecord::StatementInvalid: PG::UndefinedFunction: ERROR: function median(integer) does not exist
While checking out your active median, I noticed one missing feature that I would like. When doing percentile calculations, it's often across a set of percentiles. Something like Model.percentile(:value, [0.1, 0.25, 0.75, 0.9])
. I know this feature is supported in postgres
percentile_cont(fractions) WITHIN GROUP (ORDER BY sort_expression)
multiple continuous percentile: returns an array of results matching the shape of the fractions parameter, with each non-null element replaced by the value corresponding to that percentile
It looks like active_median is not compatible with the latest Rails 3.2 release. Are there plans to update the gem? Thanks!
Wanted to share something I ran into.
Ran into a problem with the function not existing on subsequent runs of my CI and/or after a rails db:reset
call. This is because the create function isn't stored in the schema.rb.
You can switch to :sql
as your schema structure, or you can make sure to run:
bundle exec rails runner "ActiveMedian.create_function"
every time you reset your database.
Make sure to run RAILS_ENV=test bundle exec rails runner "ActiveMedian.create_function"
too if you need it for a test database (like CI).
It appears ActiveMedian is not generating correct sql when includes
and references
are used. The following test fails on current master:
def test_references
user = User.create!
user.posts.create!(comments_count: 1)
assert_equal 1, User.includes(:posts).references(:posts).median(:comments_count)
end
1) Error:
TestActiveMedian#test_references:
ActiveRecord::StatementInvalid: PG::GroupingError: ERROR: column "users.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: ...ONT(0.50) WITHIN GROUP (ORDER BY comments_count), "users"."i...
^
: SELECT PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY comments_count), "users"."id" AS t0_r0, "users"."visits_count" AS t0_r1, "users"."latitude" AS t0_r2, "users"."rating" AS t0_r3, "users"."name" AS t0_r4, "users"."created_at" AS t0_r5, "posts"."id" AS t1_r0, "posts"."user_id" AS t1_r1, "posts"."comments_count" AS t1_r2 FROM "users" LEFT OUTER JOIN "posts" ON "posts"."user_id" = "users"."id"
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:611:in `async_exec'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:611:in `block (2 levels) in exec_no_cache'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.3/lib/active_support/dependencies/interlock.rb:48:in `block in permit_concurrent_loads'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.3/lib/active_support/concurrency/share_lock.rb:187:in `yield_shares'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.3/lib/active_support/dependencies/interlock.rb:47:in `permit_concurrent_loads'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:610:in `block in exec_no_cache'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/abstract_adapter.rb:581:in `block (2 levels) in log'
/usr/lib/ruby/2.5.0/monitor.rb:226:in `mon_synchronize'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/abstract_adapter.rb:580:in `block in log'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activesupport-5.2.3/lib/active_support/notifications/instrumenter.rb:23:in `instrument'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/abstract_adapter.rb:571:in `log'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:609:in `exec_no_cache'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql_adapter.rb:596:in `execute_and_clear'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/postgresql/database_statements.rb:81:in `exec_query'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/abstract/database_statements.rb:478:in `select'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/abstract/database_statements.rb:70:in `select_all'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/connection_adapters/abstract/query_cache.rb:106:in `select_all'
/tmp/active_median/lib/active_median/model.rb:37:in `median'
/tmp/active_median/lib/active_median/enumerable.rb:5:in `block in median'
/tmp/active_median/vendor/bundle/ruby/2.5.0/gems/activerecord-5.2.3/lib/active_record/relation.rb:281:in `scoping'
/tmp/active_median/lib/active_median/enumerable.rb:5:in `median'
/tmp/active_median/test/active_median_test.rb:69:in `test_references'
It appears all of the columns are added after the WITHIN GROUP
part.
Thanks for this awesome gem.
Thanks for the great gem! I'm not sure if this is specific to my usage, but all of my active_median
-related tests break when I upgrade Rails to 6.1.0
. Specifically here:
active_median/lib/active_median/model.rb
Line 75 in e453f62
NoMethodError:
undefined method `cast_value' for nil:NilClass
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:80:in `block (2 levels) in percentile'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:77:in `each'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:77:in `each_with_index'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:77:in `each'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:77:in `map'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:77:in `block in percentile'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:76:in `each'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:76:in `percentile'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/model.rb:4:in `median'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/enumerable.rb:5:in `block in median'
# ./.bundle/gems/active_median-0.2.5/lib/active_median/enumerable.rb:5:in `median'
Adding some debugging, result.column_types
looks like this under v6.0.3.4
:
{"repository_id"=>#<ActiveRecord::ConnectionAdapters::PostgreSQL::OID::Uuid:0x00007fd5d779b5a8 @precision=nil, @scale=nil, @limit=nil>, "percentile_cont"=>#<ActiveModel::Type::Float:0x00007fd5d77a0828 @precision=nil, @scale=nil, @limit=nil>}
After upgrading to 6.1.0
, the entry for percentile_cont
is missing, and result.column_types
looks like this:
{"repository_id"=>#<ActiveRecord::ConnectionAdapters::PostgreSQL::OID::Uuid:0x00007fd05660c7e0 @precision=nil, @scale=nil, @limit=nil>}
I'm still trying to narrow things down further, but I was wondering if you had any ideas @ankane ?
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.