
  • Expression too long error while finding synthetic control

    Dear all,

    I am using Stata 15.0 to find a synthetic control group with the synth command. No matter how short my command is, I receive the message "expression too long", r(130).

    The dataset consists of 540 time periods and 83 units (44,820 observations overall).

    My shortest try was:

    Code:
    . synth ln_births urbanization, trunit(36) trperiod(510) counit(1 7)
    
    ------------------------------------------------------------------------------------------------------------------
    Synthetic Control Method for Comparative Case Studies
    ------------------------------------------------------------------------------------------------------------------
    
    First Step: Data Setup
    ------------------------------------------------------------------------------------------------------------------
    expression too long
    r(130);
    What can be the problem?

    Thank you!
    Last edited by Pavel Jelnov; 24 Oct 2019, 10:33.

  • #2
    Dear Pavel,

    I am having the same problem currently. Did you find a solution for your issue?

    Thanks



    • #3
      use -trace- to find the problem (you may also need to set tracedepth); see
      Code:
      help trace
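
      A minimal sketch of that workflow, assuming the synth call from post #1. The depth of 2 is only an illustrative guess; raise it if the failing line is not shown.
      Code:
      * limit how many levels of nested programs are traced (2 is an assumed value)
      set tracedepth 2
      set trace on
      synth ln_births urbanization, trunit(36) trperiod(510) counit(1 7)
      * after the error, switch tracing off again
      set trace off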



      • #4
        Welcome to Statalist. You'll increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Being able to replicate your problem can be essential to helping you. The problem apparently has something to do with the data. Do you have missing values?



        • #5
          I have an idea for what is going on that is easy to verify. You have too many pre-treatment periods. Drop the first 260 periods for all units and try your code again to test this hypothesis.

          Here's an example demonstrating the problem:

          Code:
          copy "https://web.stanford.edu/~jhain/Synth/smoking.dta" "smoking.dta", replace
          use "smoking.dta", clear
          xtset state year
          
          /* This works great */
          synth cigsale beer, trunit(3) trperiod(1989)
          
          /* Let's lengthen the pre intervention history all the way to 1722 , so we have 267 pre periods */
          expand 9, gen(copy)
          bys state year: replace year = year - 31*(_n-1)
          xtdescribe
          // set trace on
          synth cigsale beer, trunit(3) trperiod(1989)
          If you -set trace on-, you can see that Stata fails in the reducesample subroutine, where it tries to create a subsample marker for the specified periods and units:
          Code:
          if inlist(year,1722,1723,1724,1725,1726,1727,1728,1729,1730,1731,1732,1733,1734,1735,1736,1737,1738,1739,1740,1741,1742,1743,1744,1745,1746,1747,1748,1749,1750,1751,1752,1753,1754,1755,1756,1757,1758,1759,1760,1761,1762,1763,1764,1765,1766,1767,1768,1769,1770,1771,1772,1773,1774,1775,1776,1777,1778,1779,1780,1781,1782,1783,1784,1785,1786,1787,1788,1789,1790,1791,1792,1793,1794,1795,1796,1797,1798,1799,1800,1801,1802,1803,1804,1805,1806,1807,1808,1809,1810,1811,1812,1813,1814,1815,1816,1817,1818,1819,1820,1821,1822,1823,1824,1825,1826,1827,1828,1829,1830,1831,1832,1833,1834,1835,1836,1837,1838,1839,1840,1841,1842,1843,1844,1845,1846,1847,1848,1849,1850,1851,1852,1853,1854,1855,1856,1857,1858,1859,1860,1861,1862,1863,1864,1865,1866,1867,1868,1869,1870,1871,1872,1873,1874,1875,1876,1877,1878,1879,1880,1881,1882,1883,1884,1885,1886,1887,1888,1889,1890,1891,1892,1893,1894,1895,1896,1897,1898,1899,1900,1901,1902,1903,1904,1905,1906,1907,1908,1909,1910,1911,1912,1913,1914,1915,1916,1917,1918,1919,1920,1921,1922,1923,1924,1925,1926,1927,1928,1929,1930,1931,1932,1933,1934,1935,1936,1937,1938,1939,1940,1941,1942,1943,1944,1945,1946,1947,1948,1949,1950,1951,1952,1953,1954,1955,1956,1957,1958,1959,1960,1961,1962,1963,1964,1965,1966,1967,1968,1969,1970,1971,1972,1973,1974,1975,1976,1977,1978,1979,1980,1981,1982,1983,1984,1985,1986,1987,1988) & state==1
          For numeric values, inlist() accepts at most 250 arguments in total, that is, at most 249 values after the first argument, so the code breaks with the 267 years listed here.

          Of course, you could be running into some other r(130) problem, but I bet this is it. You can also have too many post periods, but that is rarer than hens' teeth for most research subjects that rely on -synth-.
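
          A rough way to check this on the expanded smoking example above is to count the pre-treatment periods and then retry with a shorter history. This is only a sketch: the cap of 249 is an assumption based on inlist() being documented as taking at most 250 arguments for numeric values, and the trimming step assumes consecutive integer periods.
          Code:
          * continues from the expanded smoking data; trperiod as in the synth call above
          local trperiod 1989
          * count distinct pre-treatment periods
          quietly levelsof year if year < `trperiod', local(pre)
          local npre : word count `pre'
          display "pre-treatment periods: `npre'"

          * keep only the most recent 249 pre-treatment periods (assumed cap), then retry
          keep if year >= `trperiod' - 249
          synth cigsale beer, trunit(3) trperiod(`trperiod')
          If synth gets past the data setup step after the trim, the long inlist() is the likely culprit. The same check applies to the post-treatment periods, with year >= `trperiod' in the levelsof condition.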



          • #6
            Hi
            I am also dealing with the same issue "expression is too long".

            I have a total of 416 time periods (week) and 32 units (states; with one control unit).
            I am using Stata 17 SE. If I split my data into two parts by time period (<250 and >250), Stata can run the command on each part. However, only one part then contains the intervention time: the intervention period is 116, so the first dataset (time period <250) contains it, but the second dataset (time period >250) does not.

            I am wondering whether there is any way to overcome the "expression too long" problem and run the command on the full dataset, or whether that is not possible.

            Command: synth_runner Wkly_Ntwknd_incdnt Covid1Wkly_Ntwknd_incdnt frstWkly_Ntwknd_incdnt otlrWkly_Ntwknd_incdnt, trunit(15) trperiod(116) gen_vars

