Edit: Using Stata 14.2.
Long story short, I'm running a logit on aggregated data, and when I run margins to set up predicted effects plots, the predicted values are ludicrously large and well outside [0,1]. This only happens with grouped logit, so I'm not sure whether it is unique to grouped logit.
Longer story is I have a dataset that consists of candidates, the total number of ads they've run, and the total number of ads with a certain characteristic. I'm investigating how the proportion of a candidate's ads that include a characteristic (e.g., mentioning ideology) varies with district competitiveness. Here is an example of the dataset, restricted to the variables most relevant to the question I'm asking:
While in the broader paper I'm just using OLS because of other things, I wanted to include an appendix showing that the results are robust to using logit, since this is a [0,1] type of DV. Based on what I've understood, this is basically logit with aggregated data, where the candidate is the "group", the number of trials is the total number of ads aired, and the number of successes is the number of ads with the characteristic. So, I run my grouped logit in Stata using the advice provided here: https://www.stata.com/support/faqs/s...-grouped-data/ . This results in code that looks like this:
And produces the following output (just showing coefficients, though I can add the rest later if needed):
Logit coefficients are hard to interpret, so I decided to do a predicted effects plot using margins, as I usually do. I set up the margins command as follows:
Which is where the problems occur:
As you can see, those estimates are clearly outside the [0,1] range. I've run the model using OLS and just a normal logit (I have the proportions calculated as well, but that's obviously not appropriate), and margins runs perfectly fine and gives results within [0,1]. Granted, OLS does produce some results below 0 at the extreme values of in-party vote, but that's OLS for you. It's only with grouped logit that the predicted values are extreme. I even calculated the predictions by hand by multiplying the coefficients by the assigned values of my IVs: the results did not equal the predicted margins.
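For reference, the hand calculation described above can be written out as follows. This is only a sketch using the coefficients reported in this post for the first margins profile (ip_margin = -.3, party = 1, female = incumbent = open = 0). The sample means of the covariates held at (mean) are not shown in the post, so they stay symbolic; $\bar{x}_k$ and the index set $M$ are notation introduced here, not part of the original output.

```latex
\begin{aligned}
M &= \{\text{log\_spend},\ \text{vote\_share},\ \text{opponent\_neg},\ \text{dime\_score},\ \text{mrp\_mean}\} \\
\hat{\eta} &= \hat{\beta}_0 + \hat{\beta}_{\text{ip\_margin}}\,(-0.3) + \hat{\beta}_{\text{party}}\,(1)
             + \sum_{k \in M} \hat{\beta}_k\,\bar{x}_k \\
           &= -0.8380754 + 1.214657\,(-0.3) + 0.4515572 + \sum_{k \in M} \hat{\beta}_k\,\bar{x}_k \\
           &= -0.7509153 + \sum_{k \in M} \hat{\beta}_k\,\bar{x}_k \\
\hat{p} &= \frac{1}{1 + e^{-\hat{\eta}}}, \qquad 0 < \hat{p} < 1
\end{aligned}
```

Whatever the means turn out to be, the inverse logit keeps the predicted proportion strictly between 0 and 1, so margins of 42.8 or 84.9 cannot be on the probability scale.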
My only thought, since this happens only with this model, is that margins is for some reason converting the predictions back into "total number of successes". I really can't think of any other reason I would be getting such large predicted margins.
Code:
+-------------------------+-------------+-------------+
| cand_id                 | total_ideol | total_aired |
+-------------------------+-------------+-------------+
| "BELK_JUDY_MCCAIN_2002" |           0 |        1010 |
| "BONNER_2006"           |           0 |         477 |
| "BELK_JUDY_MCCAIN_2004" |           0 |         245 |
| "BECKERLE_2006"         |           0 |         169 |
| "BONNER_JO_2004"        |           0 |        1126 |
| "BONNER_JO_2002"        |           0 |         414 |
| "BYRNE_BRADLEY_2014"    |         223 |         223 |
| "ROGERS_MIKE_2002"      |        1757 |        3508 |
| "SEGALL_JOSH_2008"      |           . |        1379 |
| "ROGERS_MIKE_2008"      |           . |        1638 |
| "ROGERS_MIKE_2004"      |           0 |         389 |
| "TURNHAM_JOE_2002"      |           0 |        2207 |
| "ADERHOLT_2000"         |         320 |        1449 |
| "FOLSOM_2000"           |           0 |        1053 |
| "ADERHOLT_ROBERT_2012"  |           0 |           1 |
| "BACHUS_2000"           |           0 |         136 |
| "LESTER_MARK_2014"      |           0 |         183 |
| "PALMER_GARY_2014"      |           0 |         242 |
| "BAILEY_PENNY_2012"     |           0 |          35 |
| "RENZI_RICK_2002"       |           0 |        2034 |
+-------------------------+-------------+-------------+
Code:
glm total_ideol ip_margin mrp_mean party incumbent open female ///
    dime_score log_spend vote_share opponent_neg, vce(cluster statdist_cen) ///
    link(logit) family(binomial total_air)
Code:
+--------------+-----------+------------------+-------+-------+------------+-----------+
| total_ideol  |     Coef. | Robust Std. Err. |     z | P>|z| | [95% Conf. | Interval] |
+--------------+-----------+------------------+-------+-------+------------+-----------+
| ip_margin    |  1.214657 |         .6524683 |  1.86 | 0.063 |  -.0641568 |  2.493472 |
| mrp_mean     |  1.147326 |         .4847957 |  2.37 | 0.018 |   .1971439 |  2.097508 |
| party        |  .4515572 |         .3877767 |  1.16 | 0.244 |  -.3084712 |  1.211586 |
| incumbent    | -.0343667 |         .1807825 | -0.19 | 0.849 |   -.388694 |  .3199605 |
| open         |  .4163194 |         .2065463 |  2.02 | 0.044 |   .0114962 |  .8211426 |
| female       | -.4038649 |         .2619799 | -1.54 | 0.123 |  -.9173362 |  .1096063 |
| dime_score   |   .123921 |         .1910771 |  0.65 | 0.517 |  -.2505833 |  .4984253 |
| log_spend    | -.3341911 |         .1872252 | -1.78 | 0.074 |  -.7011457 |  .0327635 |
| vote_share   | -.7543874 |         1.313943 | -0.57 | 0.566 |  -3.329668 |  1.820893 |
| opponent_neg |  .0146609 |         .2361849 |  0.06 | 0.951 |   -.448253 |  .4775748 |
| _cons        | -.8380754 |         1.192996 | -0.70 | 0.482 |  -3.176305 |  1.500154 |
+--------------+-----------+------------------+-------+-------+------------+-----------+
Code:
margins, at(ip_margin=(-.3(.1).3) ///
    (mean) log_spend vote_share opponent_neg dime_score mrp_mean ///
    female=0 incumbent=0 open=0 party=1)
Code:
+-----+----------+------------------------+------+-------+------------+-----------+
| _at |   Margin | Delta-method Std. Err. |    z | P>|z| | [95% Conf. | Interval] |
+-----+----------+------------------------+------+-------+------------+-----------+
|   1 | 42.84476 |               12.86015 | 3.33 | 0.001 |   17.63933 |  68.05018 |
|   2 | 48.11459 |               12.41373 | 3.88 | 0.000 |   23.78413 |  72.44505 |
|   3 | 53.99641 |               12.20703 | 4.42 | 0.000 |   30.07108 |  77.92174 |
|   4 | 60.55199 |               12.62734 | 4.80 | 0.000 |   35.80286 |  85.30112 |
|   5 | 67.84699 |               14.13677 | 4.80 | 0.000 |   40.13943 |  95.55455 |
|   6 | 75.95055 |               17.07385 | 4.45 | 0.000 |   42.48643 |  109.4147 |
|   7 | 84.93473 |               21.55822 | 3.94 | 0.000 |   42.68139 |  127.1881 |
+-----+----------+------------------------+------+-------+------------+-----------+
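One way to test the suspicion that these margins are expected counts rather than proportions is to ask margins for a different prediction. The sketch below assumes the glm model from this post has just been fit and its results are still in memory. The options predict(xb) and expression(invlogit(predict(xb))) are not taken from the original post. The working assumption, which should be checked against the glm postestimation documentation, is that the default prediction after glm (mu) for the binomial family with a denominator variable is the expected number of successes, that is, the denominator times the inverse logit of the linear predictor.

```stata
* Assumes the glm model from this post was just estimated (results in e()).

* Step 1: linear predictor (log-odds) at the chosen covariate profile.
* A hand calculation of the coefficients times the assigned values
* should reproduce these numbers.
margins, predict(xb) at(ip_margin=(-.3(.1).3) ///
    (mean) log_spend vote_share opponent_neg dime_score mrp_mean ///
    female=0 incumbent=0 open=0 party=1)

* Step 2: the same profile on the probability scale.
* invlogit() bounds every prediction strictly between 0 and 1.
margins, expression(invlogit(predict(xb))) at(ip_margin=(-.3(.1).3) ///
    (mean) log_spend vote_share opponent_neg dime_score mrp_mean ///
    female=0 incumbent=0 open=0 party=1)

* Plot the step 2 results as the predicted effects plot.
marginsplot
```

If step 2 returns values inside (0,1) while the default call returns values like those in the table above, the default margins are on the count scale, averaged over each candidate's number of ads aired, which would explain why they exceed 1.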