  • Modulo, Mathematics, and Stata Notation

    For a simulation, there's an equation that (in part) takes the form of

    0.3 · (t mod (T + 1)) − (t mod 10) · sin(t/π)

    where t indexes the time period (t = 1, ..., T) and T is the total number of periods. Let's quickly simulate part of it.
    Code:
    clear
    
    set obs 100 // 100 units
    
    qui g id = _n // our ID
    
    expand 2000 // 2000 time periods
    
    qbys id: g time = _n // 1...2000
    The outcome is where I'm having trouble. I recognize "mod" as the modulo operator, but I'm unsure how to implement this expression in Stata.

    More concretely, does this mean we'd do
    Code:
    su time, mean
    
    loc max = r(max)+1
    
    qbys id: g y = 0.3 * mod(time,`max') - ///
    mod(time,10)*sin(time/_pi)
    When I see t mod(T+1), it seems like it implies multiplication, but that doesn't make any sense. Does my code here seem like the right way to think about this?

  • #2
    Your code looks good to me. I don't understand what you mean by "When I see t mod(T+1), it seems like it implies multiplication, but that doesn't make any sense."

    I'll tell you, however, what doesn't make sense to me in this. If T is the total number of periods, and t is an index that ranges from 1 to T, then it is always the case that t < T+1. That, in turn, implies that mod(t, T+1) == t. So why the unnecessary complexity? Why doesn't the formula just have 0.3*t - mod(t, 10)*sin(t/_pi)? Perhaps we are both missing something here?
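
    The claim above can be checked directly. What follows is only a minimal sketch, assuming a single unit observed for T = 2000 periods as in #1; the variable names y_full and y_simple are made up for this illustration. Stata defines mod(x, y) as x - y*floor(x/y), so for 1 <= t <= T the term mod(t, T+1) should return t unchanged.

```stata
clear
set obs 2000                       // one unit, T = 2000 periods (illustrative)
g time = _n                        // t = 1, ..., T

su time, mean                      // meanonly still returns r(max)
loc max = r(max) + 1               // T + 1

// mod(x, y) = x - y*floor(x/y); with 1 <= t < T + 1 the floor term is 0
assert mod(time, `max') == time

// so the full and the simplified expressions should agree row by row
g double y_full   = 0.3*mod(time, `max') - mod(time, 10)*sin(time/_pi)
g double y_simple = 0.3*time             - mod(time, 10)*sin(time/_pi)
assert y_full == y_simple
```

    If both assert commands pass silently, the mod(t, T+1) wrapper has no effect over this range of t. It would only matter if t could exceed T.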


    • #3
      That's quite a good question. The formula was obtained from this Journal of Machine Learning (pages 25-26, for quick reference). The reason I say that t is indexed to the time period is that this is what they appear to write at the bottom of page 25. Precisely, they say that "For each time t ∈ [T], we assign latent variable ρ_t = t." So in the context of Stata, it seemed like
      Code:
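      // note: locals int, trid, and var are defined but not used in this snippet
      loc int = 1600
      
      // seasonal terms: time folded into 360- and 180-period cycles
      loc f1 mod(time,360)
      
      loc f2 mod(time,180)
      
      loc f3 2*mod(time,360)
      
      loc f4 2*mod(time,180)
      
      loc trid = 70
      
      clear
      
      set obs 100 // 100 units
      
      egen id = seq(), f(1) t(100)
      
      expand 2000 // time periods
      
      
      loc var = .1
      
      bys id : g time = _n // 1...2000 within each unit
      
      set seed 1000
      cls
      
      su time, mean
      
      loc max = r(max)+1 // T + 1
      
      // noise + exponential trend in time/(T+1) + four seasonal terms (degrees converted to radians)
      qbys id: g y = runiform() + ///
          (.3*runiform()*(time/`max'))* ///
          exp(time/`max') + ///
          cos(((`f1')*_pi)/180) + ///
          0.5*sin(((`f2')*_pi)/180) + ///
          1.5*cos(((`f3')*_pi)/180) - ///
          0.5*sin(((`f4')*_pi)/180)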
      was the way to go.
      I originally thought that rho was meant to represent some specific number, a constant perhaps, but my reading of their equation here is that rho represents either a random variable or the time period in question. I could be wrong, though. These folks are electrical engineers and computer scientists, so perhaps I misunderstand what they mean by defining a latent variable.
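
      To make that reading concrete, here is a minimal sketch that assumes ρ_t = t is taken literally, so the latent variable is just the time index. The variable name rho and the small panel dimensions (3 units, 20 periods) are assumptions for illustration only and are not taken from the paper.

```stata
clear
set obs 3                          // 3 units (illustrative)
g id = _n
expand 20                          // 20 periods per unit (illustrative)
bys id: g time = _n

g rho = time                       // assumed reading: rho_t = t

su time, mean
loc max = r(max) + 1               // T + 1

// the partial formula from #1, written in terms of rho
g double y = 0.3*mod(rho, `max') - mod(rho, 10)*sin(rho/_pi)

// under this reading the term depends on time only, so it should be
// identical across units within each period
bys time (id): assert y == y[1]
```

      If that reading is right, this part of the outcome is a deterministic function of time shared by every unit, and any unit-level variation would have to come from other terms in their equation.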
