  • Julia back end integrated into boottest

    In December I posted about a new Julia implementation of the wild bootstrap and demonstrated how to call it from Stata...by way of Python. In computationally demanding applications, it can be much faster.

    JanDitzen made the excellent suggestion that I add a julia option to boottest to automate this link, which I have now done. It still requires (free) installation of both Python and Julia. (Julia must be installed so that it is accessible through the system path.) boottest tries to automate the set-up from there, installing Python and Julia packages as needed. Let me know if that part crashes on you. The latest version of boottest can be installed with ssc install boottest, replace. Because it uses Python to connect to Julia, it requires Stata 16 or later.
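
    A quick way to sanity-check the prerequisites from inside Stata might be something like this (a rough sketch; the last line assumes Julia is reachable through the system path, and on Windows its output appears in a separate command window):
    Code:
    ssc install boottest, replace   // get the latest boottest
    which boottest                  // confirm which copy Stata will run
    python query                    // confirm Stata is linked to a Python 3 executable
    !julia --version                // confirm Julia is on the system path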

    The first time the Julia version is called in a Stata session, or called to run a new kind of test, such as the WCR after regress or the WRE after ivregress, there will be a significant delay as Julia compiles code "just in time." For the rest of the session, it should run fast. But if the regular version of boottest is already fast in your applications, the Julia version will not save time.

    There is also a float(32) option, which tells the Julia implementation to use single-precision math; it often returns essentially the same answer in less time. Random number generation in Julia uses the StableRNGs.jl package in order to guarantee exact replicability.
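
    For example, a minimal sketch combining these options with boottest's seed() option, assuming the seed carries through to the StableRNGs generator so that repeated runs reproduce the same p-value (this continues the annotated session shown below, where individual and the merit regression are already set up):
    Code:
    boottest merit, nogr reps(9999) bootcluster(individual) julia float(32) seed(42)
    boottest merit, nogr reps(9999) bootcluster(individual) julia float(32) seed(42)  // same seed, so the reported p-value should replicate exactly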

    I think this project is interesting as a model of unified cross-platform development. A complex procedure can be implemented and refined once, rather than de novo for R, Stata, Python, etc. And it can still be done in a platform-independent way, unlike C plug-ins for Stata, which the author must compile separately for Windows, Linux, and Mac. Alexander Fischer is incorporating the same Julia back end, WildBootTests.jl, into fwildclusterboot for R.

    If StataCorp integrated Julia into Stata the way it has Python, this model could become even more viable for Stata. I've worked to automate the Julia set-up, but because of the current need to link via Python and the complexity of supporting multiple operating systems in multiple configurations, I worry that it will sometimes fail and would benefit from a professional touch.

    And if Julia got to the point where the same just-(not)-in-time compilation didn't need to be run anew in every session, that would be even better...

    Here is an annotated session performing a subcluster wild bootstrap on a data set from Tim Conley. set rmsg is turned on to show how many seconds each command takes:
    Code:
    . infile coll merit male black asian year state chst using regm.raw, clear
    (42,161 observations read)
    
    .
    . generate individual = _n  // unique ID for each observation
    
    .
    . qui xi: regress coll merit male black asian i.year i.state, cluster(state)
    
    . set rmsg on
    r; t=0.00 8:13:58
    
    . boottest merit, nogr reps(9999) bootcluster(individual)  // without julia option
    
    Wild bootstrap-t, null imposed, 9999 replications, Wald test, bootstrap clustering by individual, Rademacher weights:
      merit
    
                               t(50) =     2.6646
                            Prob>|t| =     0.0352
    
    95% confidence set for null hypothesis expression: [.002728, .06488]
    r; t=21.94 8:15:06
    
    . boottest merit, nogr reps(9999) bootcluster(individual) julia  // with julia option: slow the first time called in a Stata session
    
    Wild bootstrap-t, null imposed, 9999 replications, Wald test, bootstrap clustering by individual, Rademacher weights:
      merit
    
                               t(50) =     2.6646
                            Prob>|t| =     0.0319
    
    95% confidence set for null hypothesis expression: [.003473, .06414]
    r; t=28.21 8:15:46
    
    . boottest merit, nogr reps(9999) bootcluster(individual) julia  // same command, faster now
    
    Wild bootstrap-t, null imposed, 9999 replications, Wald test, bootstrap clustering by individual, Rademacher weights:
      merit
    
                               t(50) =     2.6646
                            Prob>|t| =     0.0319
    
    95% confidence set for null hypothesis expression: [.003804, .06383]
    r; t=3.55 8:15:53
    
    . boottest merit, nogr reps(9999) bootcluster(individual) julia float(32)  // slow first time called in single precision
    
    Wild bootstrap-t, null imposed, 9999 replications, Wald test, bootstrap clustering by individual, Rademacher weights:
      merit
    
                               t(50) =     2.6646
                            Prob>|t| =     0.0295
    
    95% confidence set for null hypothesis expression: [.004034, .0636]
    r; t=15.78 8:16:15
    
    . boottest merit, nogr reps(9999) bootcluster(individual) julia float(32)  // second time, even faster
    
    Wild bootstrap-t, null imposed, 9999 replications, Wald test, bootstrap clustering by individual, Rademacher weights:
      merit
    
                               t(50) =     2.6646
                            Prob>|t| =     0.0305
    
    95% confidence set for null hypothesis expression: [.003753, .06379]
    r; t=2.63 8:16:21
    Last edited by David Roodman; 22 Mar 2022, 08:08.

  • #2
    Code:
    . boottest, seed(123) weight(webb) nogr reps(1000) julia
    
    Could not automatically initialize the PyJulia package.
    Perhaps these instructions will help.
    r(198);
    Best regards.

    Raymond Zhang
    Stata 17.0, MP



    • #3
      Did you try clicking on "these instructions"?



      • #4
        David Roodman I want to use the amazing Julia option in boottest to speed up a WCR-R bootstrap on a very large dataset.
        I followed all the installation steps in the boottest Stata manual to get the Julia option up and running.
        Unfortunately, I seem to get various errors.

        Throughout, I used the example dataset from "use https://raw.githubusercontent.com/droodman/boottest/master/data/collapsed"
        and ensured that the latest version of boottest was installed:
        Code:
        ssc install boottest, replace
        which boottest
        C:\Users\pjt\ado\plus\b\boottest.ado
        *! boottest 4.3.0 17 November 2022
        *! Copyright (C) 2015-22 David Roodman
        *! Version history at bottom

        Code:
        use https://raw.githubusercontent.com/droodman/boottest/master/data/collapsed
        regress hasinsurance selfemployed post post_self, cluster(year)
        boottest post_self=.04 // wild bootstrap, Rademacher weights, null imposed, 999 replications
        boottest post_self=.04, julia // wild bootstrap, Rademacher weights, null imposed, 999 replications, with Julia option activated
        When running this code on my machine, I got the following error messages:
        First I got the error 'Could not automatically install the PyJulia package'.
        After manually installing the PyJulia package (following the instructions in the boottest manual), I still got the error 'Could not automatically initialize the PyJulia package'.

        Somebody from my IT department tried to make things work by adjusting something, and now Julia actually does start up. But after the Julia window opens, I get an error in the Python layer between Stata and Julia, shown below.
        I'm wondering how to solve this and how to proceed. When I tried again this afternoon to run boottest with the julia option on the example above (boottest post_self=.04, julia), Stata closed down entirely...

        Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        ArithmeticError: <PyCall.jlwrap (in a Julia function called from Python)
        JULIA: InexactError: trunc(Int64, NaN)
        Stacktrace:
        [1] trunc
        @ .\float.jl:805 [inlined]
        [2] floor
        @ .\float.jl:367 [inlined]
        [3] plot!(o::WildBootTests.StrBootTest{Float64})
        @ WildBootTests m:\p_Juliagebruiker\ant\packages\WildBootTests\cMXrF\src\plot-CI.jl:129
        [4] __wildboottest(R::Matrix{Float64}, r::Vector{Float64}; resp::Vector{Float64}, predexog::Matrix{Float64}, predendog::Matrix{Float64}, inst::Matrix{Float64}, R1::Matrix{Float64}, r1::Vector{Float64}, clustid::Matrix{Int64}, nbootclustvar::Int64, nerrclustvar::Int64, issorted::Bool, hetrobust::Bool, nfe::Int64, feid::Vector{Int64}, fedfadj::Int64, obswt::Vector{Float64}, fweights::Bool, maxmatsize::Float16, ptype::Symbol, bootstrapc::Bool, liml::Bool, fuller::Float64, kappa::Float64, arubin::Bool, small::Bool, clusteradj::Bool, clustermin::Bool, jk::Bool, scorebs::Bool, reps::Int64, imposenull::Bool, auxwttype::Symbol, rng::StableRNGs.LehmerRNG, level::Float64, rtol::Float64, madjtype::Symbol, nH0::Int16, ml::Bool, scores::Matrix{Float64}, beta::Vector{Float64}, A::LinearAlgebra.Symmetric{Float64, Matrix{Float64}}, gridmin::Vector{Float64}, gridmax::Vector{Float64}, gridpoints::Vector{Float64}, diststat::Symbol, getci::Bool, getplot::Bool, getauxweights::Bool)
        @ WildBootTests m:\p_Juliagebruiker\ant\packages\WildBootTests\cMXrF\src\interface.jl:141
        [5] _wildboottest(T::DataType, R::Matrix{Float64}, r::Matrix{Float64}; resp::Vector{Float64}, predexog::Matrix{Float64}, predendog::Matrix{Float64}, inst::Matrix{Float64}, R1::Matrix{Float64}, r1::Vector{Float64}, hetrobust::Bool, clustid::Vector{Int64}, nbootclustvar::Int64, nerrclustvar::Int64, issorted::Bool, nfe::Int64, feid::Matrix{Int64}, fedfadj::Int64, obswt::Matrix{Float64}, fweights::Bool, maxmatsize::Int64, ptype::Symbol, bootstrapc::Bool, liml::Bool, fuller::Int64, kappa::Float64, arubin::Bool, small::Bool, clusteradj::Bool, clustermin::Bool, jk::Bool, scorebs::Bool, reps::Int64, imposenull::Bool, auxwttype::Symbol, rng::StableRNGs.LehmerRNG, level::Int64, rtol::Float64, madjtype::Symbol, nH0::Int64, ml::Bool, scores::Matrix{Float64}, beta::Matrix{Float64}, A::Matrix{Float64}, gridmin::Vector{Float64}, gridmax::Vector{Float64}, gridpoints::Vector{Float64}, diststat::Symbol, getci::Bool, getplot::Bool, getauxweights::Bool)
        @ WildBootTests m:\p_Juliagebruiker\ant\packages\WildBootTests\cMXrF\src\interface.jl:268
        [6] wildboottest(T::Type, R::Matrix{Float64}, r::Matrix{Float64}; kwargs::Base.Pairs{Symbol, Any, NTuple{44, Symbol}, NamedTuple{(:arubin, :obswt, :getplot, :resp, :scores, :getauxweights, :ptype, :gridmin, :liml, :madjtype, :gridmax, :getci, :clustid, :fuller, :nbootclustvar, :auxwttype, :predexog, :hetrobust, :rtol, :imposenull, :nH0, :scorebs, :A, :R1, :rng, :reps, :diststat, :bootstrapc, :beta, :inst, :jk, :r1, :fweights, :nfe, :feid, :maxmatsize, :level, :ml, :fedfadj, :gridpoints, :small, :nerrclustvar, :predendog, :issorted), Tuple{Bool, Matrix{Float64}, Bool, Vector{Float64}, Matrix{Float64}, Bool, String, Vector{Float64}, Bool, String, Vector{Float64}, Bool, Vector{Int64}, Int64, Int64, String, Matrix{Float64}, Bool, Float64, Bool, Int64, Bool, Matrix{Float64}, Matrix{Float64}, StableRNGs.LehmerRNG, Int64, String, Bool, Matrix{Float64}, Matrix{Float64}, Bool, Vector{Float64}, Bool, Int64, Matrix{Int64}, Int64, Int64, Bool, Int64, Vector{Float64}, Bool, Int64, Matrix{Float64}, Bool}}})
        @ WildBootTests m:\p_Juliagebruiker\ant\packages\WildBootTests\cMXrF\src\interface.jl:402
        [7] invokelatest(::Any, ::Any, ::Vararg{Any}; kwargs::Base.Pairs{Symbol, Any, NTuple{44, Symbol}, NamedTuple{(:arubin, :obswt, :getplot, :resp, :scores, :getauxweights, :ptype, :gridmin, :liml, :madjtype, :gridmax, :getci, :clustid, :fuller, :nbootclustvar, :auxwttype, :predexog, :hetrobust, :rtol, :imposenull, :nH0, :scorebs, :A, :R1, :rng, :reps, :diststat, :bootstrapc, :beta, :inst, :jk, :r1, :fweights, :nfe, :feid, :maxmatsize, :level, :ml, :fedfadj, :gridpoints, :small, :nerrclustvar, :predendog, :issorted), Tuple{Bool, Matrix{Float64}, Bool, Vector{Float64}, Matrix{Float64}, Bool, String, Vector{Float64}, Bool, String, Vector{Float64}, Bool, Vector{Int64}, Int64, Int64, String, Matrix{Float64}, Bool, Float64, Bool, Int64, Bool, Matrix{Float64}, Matrix{Float64}, StableRNGs.LehmerRNG, Int64, String, Bool, Matrix{Float64}, Matrix{Float64}, Bool, Vector{Float64}, Bool, Int64, Matrix{Int64}, Int64, Int64, Bool, Int64, Vector{Float64}, Bool, Int64, Matrix{Float64}, Bool}}})
        @ Base .\essentials.jl:718
        [8] _pyjlwrap_call(f::Function, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
        @ PyCall m:\p_Juliagebruiker\ant\packages\PyCall\ygXW2\src\callback.jl:32
        [9] pyjlwrap_call(self_::Ptr{PyCall.PyObject_struct}, args_::Ptr{PyCall.PyObject_struct}, kw_::Ptr{PyCall.PyObject_struct})
        @ PyCall m:\p_Juliagebruiker\ant\packages\PyCall\ygXW2\src\callback.jl:44>
        r(7102);

        Last edited by Arjan Trinks; 05 Dec 2022, 09:59.



        • #5
          Arjan Trinks Thank you for the feedback. This is a complicated business, unfortunately. Something that could help: can you try the Julia and Python examples displayed at https://github.com/droodman/WildBootTests.jl, i.e., by pasting them directly into the Julia and Python front-ends, outside of Stata? I would be interested to know whether either or both work.
          Last edited by David Roodman; 05 Dec 2022, 11:12.



          • #6
            David Roodman Many thanks for pointing to this solution! It eventually worked out! I have now managed to get things running from Stata directly, i.e. using the boottest command with the julia option.

            The errors seemed to stem from how boottest runs in our organization's environment. I use Stata MP 17.0 on a server with 256GB of working memory.
            The main problem was that our organization had Python 2 installed as standard, which apparently does not work with the packages that boottest relies on.
            Additionally, some manual preparation of the Python/Julia environment was needed, as it did not work automatically.


            To summarize, here are the steps I took; possibly they will be relevant for somebody encountering a similar problem (although some issues might be specific to our organization):

            1. Follow the boottest manual (in Stata: h boottest) fully to install everything needed for the Julia option.

            2. Try to run the wild bootstrap outside Stata (see David's comment #5 above). In my case, I ran it:
            --- from Julia => this worked
            --- from Python via PyJulia => this worked
            --- from Stata via boottest => this did NOT work
            --- from Stata via Python and PyJulia => this worked.

            3. Ensure that Python 3 is used in all the processes by running something like the following (a compressed in-Stata variant is sketched after these bullets):
            --- open Stata and run: python set exec d:\progs\anaconda3-test\python.exe, permanently
            --- close down Stata
            --- open Command prompt and run: asterminal
            --- a window pops up titled ‘Terminal voor Batch jobs’ (Dutch for ‘Terminal for batch jobs’)
            --- here, run the following:
            activate d:\progs\anaconda3-test
            python
            import julia
            julia.install()
            exit()
            exit
            --- open Stata and run the boottest command with the julia option on the example in the boottest manual. The first time takes quite a while: Julia opens and closes maybe 50 times and various packages are installed (WildBootTests.jl, StableRNGs.jl). But the second time is really fast!
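
            For completeness, a compressed sketch of roughly the same step-3 preparation run from inside Stata; this is untested on our setup and assumes Stata's python integration plus PyJulia's documented julia.install() helper (the path is just our placeholder):
            Code:
            * point Stata at the Python 3 executable (placeholder path)
            python set exec d:\progs\anaconda3-test\python.exe, permanently
            * prepare PyJulia from Stata's embedded Python
            python:
            import julia
            julia.install()   # installs/links PyCall so PyJulia can start Julia
            end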



            • #7
              Thanks Arjan Trinks, this is extremely helpful. I will look into trying to make boottest detect or prevent some of these issues. Alas, it is very complicated, with different flavors of Python, Julia, Conda, different OS's, etc. This is why it would be great if Stata directly supported Julia.

              To clarify item 2 for other users: use the Julia and Python examples at https://github.com/droodman/WildBootTests.jl. I assume "from Stata via Python and PyJulia" means typing "python" in Stata and then running the Python example.

              The Julia version will not always be faster. Data have to be copied from Stata to Python to Julia, which takes time. And there is an algorithmic difference in the bootstrap for instrumental variables estimation, which makes it faster when there are few clusters but, I fear, backfires when there are many.



              • #8
                Dear David Roodman, after the success of running boottest+julia on the example dataset in the boottest manual, I have tried running boottest+julia on my own dataset.
                First, I tried running boottest+julia on a 2% random subsample of my dataset. This works.

                Second, I tried running it on my full dataset. Here, Stata crashes when running boottest+julia.
                Probably this is due to the large size of the dataset (file size about 40GB, N = 40 million observations). There might be some failure in memory usage, as boottest+julia may lead both Stata and Julia to hold everything in memory.
                I work with Stata MP 17.0 on a server with 512GB of working memory in total (16 cores).

                I tried the following to ensure that Stata uses the maximum available memory (by default it used only 1 core, which seemed strange to me for Stata MP):
                1. Force Stata to use all memory (all cores) by using the "parallel" command:
                Code:
                net install parallel, from(https://raw.github.com/gvegayon/parallel/stable/) replace
                mata mata mlib index
                parallel numprocessors
                parallel initialize 16 // enforce the use of 16 cores
                parallel do "try main results with julia.do"
                Result: this does not work; I get the following error message:
                * Checking for break *
                . mata: parallel_break()
                r; t=0.00 14:40:39
                . use "$data\dataset", clear
                op. sys. refuses to provide memory
                Stata's data-storage memory manager has already allocated 22176m bytes
                and it just attempted to allocate another 32m bytes. The operating
                system said no. Perhaps you are running another memory-consuming task
                and the command will work later when the task completes. Perhaps you are
                on a multiuser system that is especially busy and the command will work
                later when activity quiets down. Perhaps a system administrator has put
                a limit on what you can allocate; see help memory. Or perhaps that's all
                the memory your computer can allocate to Stata.
                r(909); t=0.29 14:40:39
                I checked, and this same error message also appears when I do not run boottest at all but instead run a simple regression (reg y x). So using the "parallel" command in Stata MP doesn't seem to be a good idea.

                2. Increase the standard memory limits in Stata, while letting Stata MP optimize the parallelization:
                Code:
                set segmentsize 1g, permanently // recommended for large datasets and 512GB working memory
                set niceness 1, permanently //  "Niceness 10 corresponds to being totally nice.  Niceness 0 corresponds to being an inconsiderate, self-centered, totally selfish jerk."
                set max_memory 512g, permanently
                set min_memory 400g, permanently //  I set this somewhat lower to have some buffer in memory usage on the server (on which there may run other jobs).
                Result: when running boottest+julia, memory usage goes to about 200GB, but Stata still crashes after running for about 10 minutes (during which the Julia window opens and closes several times, and the "loading wheel" at the bottom right is still in motion).

                3. Running boottest WITHOUT the julia option: this works, but the running time is still very long for my analysis. Still, the above settings to allow higher memory usage by Stata do speed things up somewhat.

                I wonder what could cause the julia option not to work on my large dataset, and what a solution could be.

                A solution I thought of was to run the whole analysis in Julia; however, I doubt whether this is possible, since boottest in my case runs after a specific Stata regression package (ivreghdfe, which is based on ivreg2).
                Specifically, I run:
                Code:
                ssc install boottest, replace // ensure to use the most recent version of boottest
                foreach v in $depvars {
                        ivreghdfe `v' $indepvars_exog $t (indepvar_of_interest = iv_for_indepvar_of_interest), absorb($i) cluster(country sector)  
                                 * This is a panel dataset with many firms (i) for about 10 years (t).  
                                 * The variable of interest is measured at the country-sector-year level, so there is a need to cluster errors since the observations of different firms within the same country or sector are correlated.  
                                 * There are 32 countries and 15 sectors, so there is a need to use (WRE) cluster bootstrapping.
                        local dp = 4    // rounding the statistics reported in the results table to 4 digits.
                        local dp_value = 10^-`dp'
                        local idstat = round(`e(idstat)', `dp_value')
                        local idp = round(`e(idp)', `dp_value')
                        local widstat = round(`e(widstat)', `dp_value')
                        outreg2 using "$output_mainresults\mainresults_WRE.doc", drop($t `v') ctitle(`v'_CRVE_cs) stat(coef tstat pval se ci N) addtext(idstat, `idstat', idp, `idp', widstat, `widstat', N_clust, `e(N_clust)', N_clust1, `e(N_clust1)', N_clust2, `e(N_clust2)') dec(4) append
                        
                        boottest indepvar_of_interest = 0, cluster(country sector) bootcluster(sector) nograph seed(999) reps(999)
                        local b_t = `r(t)'
                        local b_p = `r(p)'
                        outreg2 using "$output_mainresults\mainresults_WRE.doc", drop($t `v') ctitle(`v'_WRE_cs_bs) addtext(b_p, `b_p', b_t, `b_t', b_CIstr, `r(CIstr)') dec(4) append
                }
                Also, I wonder whether combining 'julia' with IV regression is safe to begin with, as I did not fully understand what you meant in your statement about this in comment #7?
                Last edited by Arjan Trinks; 13 Dec 2022, 04:43.



                • #9
                  Wow, you are pushing it hard.

                  If you use parallel, then you are going to duplicate your huge data set many times, one for each copy of Stata that will be simultaneously launched. That could easily swamp the available memory.

                  I'm guessing the burden, even when not using parallel, is created by trying to copy the data for your regression from Stata to Python to Julia. That in itself might use up your memory.

                  I think if you are going to work at such a scale, it would be worth exploring working directly in Julia. You could try doing it through notebooks. Notice that in the Julia WildBootTests.jl example, the regression of interest is never run, just the wild bootstrap test. While boottest is constructed as a post-estimation command in Stata, in reality it hardly relies on the estimation results: it does the estimation itself. So no estimation is needed before calling wildboottest() in Julia.

                  But you could also run the regression in Julia. There's an analog to reghdfe and ivreghdfe in Julia, which is apparently a lot faster, though I haven't used it.

                  Stata files can be loaded easily in Julia:
                  Code:
                  using DataFrames, StatFiles
                  df = DataFrame(load("mydir/nlsw88.dta"))
                  boottest with the julia option does work after IV. If you have many small clusters, it might (currently) be slower than boottest without julia.

                  If you stay in Stata, you might need boottest's matsize() option. I'd start with small examples--a subsample of the data, a low number of reps()--then build up.
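
                  For instance, a rough sketch of that build-up, reusing the placeholders from #8 (illustrative only; depvar stands in for one of $depvars, and see help boottest for exactly what matsize() limits):
                  Code:
                  preserve
                  sample 2     // start with a 2% random subsample
                  ivreghdfe depvar $indepvars_exog $t (indepvar_of_interest = iv_for_indepvar_of_interest), absorb($i) cluster(country sector)
                  boottest indepvar_of_interest = 0, cluster(country sector) bootcluster(sector) nograph seed(999) reps(99) julia
                  restore
                  * once that runs cleanly, raise reps(), move to the full sample, and consider float(32) and matsize()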



                  • #10
                    And maybe the float(32) option will help when you call Julia from boottest.



                    • #11
                      Dear David Roodman: underneath the Christmas tree, I tried your suggestion of running everything directly from Julia, given that the problem lies in memory usage, which might be handled much more efficiently in Julia.

                      Below are the conclusions, which are a bit disappointing.
                      1. Indeed, the Julia analog of ivreghdfe is much, much faster than Stata for our large dataset: one IV fixed-effects regression, without "boottesting" anything, takes 2 minutes in Julia versus 2 hours in Stata.
                      2. The wildboottest() command in Julia, however, turns out to be very slow and, after 80+ hours, ends in an error. It also appears that Julia uses only a fraction of the available memory, by far not all of it.

                      I sent you some more information on precisely how we ran it in Julia, in a direct message. Possibly you might spot a mistake or have a further suggestion.



                      • #12
                        Arjan Trinks, as I say, you certainly are pushing the code hard. Before I wrote boottest, it took a matter of minutes to wild-bootstrap a simple OLS regression with one-way clustering. Now it takes a fraction of a second, and people's expectations have shifted too.
                        That said, there is one significant change I can make to the Julia back end, which would flow through to boottest when used with the Julia option, and which might help. It is not trivial to implement, but may not be too hard once confronted. About a year ago I optimized the IV code in a way that is great when clusters are few but may backfire when clusters--or intersections of clusters when there is multiway clustering--are many. I probably need to restore the old code alongside the new, and have the program pick which implementation to run in any given case.
                        Does it do much better if you drop the multiway clustering?

