Rewriting R code

What is this code doing?

Have a look at the following simple code control.

  d = 0
  wkd = 7
  w = 0
  nd = 0
  y = 0
  while(nd < 1000){
    d = d + 1
    if (d == wkd){
      d = 0
      w = w + 1
    }
    nd = nd + 1
    if (d == 365){
      y = y + 1
    }
  }
  print(w)
  print(y)

In the end you probably figured out what it does. It basically counts the number of days and weeks in the 1000 days that begin the statement. Well, the truth is, that you read it and have no idea what is happening. THat is bad. Nobody can fix if something goes wrong. There might be bug inside that you miss and you have no idea where - and there is a bug in the code. So what you happen if we just fix it with comments?

  #day counter
  d = 0
  #days in a week
  wkd = 7
  #number of weeks
  w = 0
  #number of days
  nd = 0
  #number of years
  y = 0
  while(nd < 1000){
    d = d + 1 #add a day
    if (d == wkd){ #check if week
      d = 0
      w = w + 1 #add 1 to weeks
    }
    nd = nd + 1 #add counter
    if (d == 365){ #check if days equal number of days in a year
      y = y + 1 #add one to year
    }
  }
  print(w)
  print(y)

Did I solve something? No. I made thing worse, less readable and longer. So let's make it slightly better.

  dayCounter = 0
  daysInWeek = 7
  nWeeks= 0
  nDays = 0
  nYears = 0
  while(nDays < 1000){
    dayCounter = dayCounter + 1
    if (dayCounter == daysInWeek){
      dayCounter = 0
      nWeeks = nWeeks + 1
    }
    nDays = nDays + 1
    if (dayCounter == 365){
      nYears = nYears + 1
    }
  }
  print(nWeeks)
  print(nDays)

Suddenly we don't need comments and also the bug is immediatelly clear.

if (dayCounter == 365){
  nYears = nYears + 1
}

Should be

if (nDays == 365){
  nYears = nYears + 1
}

But we can make things even better. Days in week is a constant, so we mark it as such. Also, days in week are variable but days in year are not. That is inconsistent. So following is more clear

DAYS_IN_WEEK = 7
DAYS_IN_YEAR = 365

Rewrite one line if for shorter code (if the language allows it)

if (dayCounter == 365) nYears = nYears + 1

And most importantly, put everything that does one thing into a function. Now the problem with R is that id doesn't allow two outputs from function, so we need to be smart about it. One option is to write two functions.

function = WeeksInDays(nDays){
  DAYS_IN_WEEK = 7
  dayCounter = 0
  nWeeks = 0
  for(i in 1:nDays){
    dayCounter = dayCounter + 1
    if (dayCounter == DAYS_IN_WEEK){
      dayCounter = 0
      nWeeks = nWeeks + 1
    }
  }
  return(nWeeks);
}
function = YearsInDays(nDays){
  DAYS_IN_YEAR = 365
  dayCounter = 0
  nYears = 0
  for(i in 1:nDays){
    dayCounter = dayCounter + 1
    if (dayCounter == DAYS_IN_YEAR){
      dayCounter = 0
      nYears = nYears + 1
    }
  }
  return(nYears);
}
print(WeeksInDays(1000))
print(YearsInDays(1000))

Now that looks better but actually, for loop is crap to do such thing. Always try to thing if there is simple sollution that uses less lines of code. Formely it was paramount to write economic code, but now readibility is preffered. So what about modulo?

function = WeeksInDays(nDays){
  DAYS_IN_WEEK = 7
  remainderDays = nDays %% DAYS_IN_WEEK
  nWeeks = (ndays - remainderDays) / DAYS_IN_WEEK
  return(nWeeks);
}
function = YearsInDays(nDays){
  DAYS_IN_YEAR = 365
  remainderDays = nDays %% DAYS_IN_YEAR
  nYears = (ndays - remainderDays) / DAYS_IN_YEAR
  return(nYears);
}
print(WeeksInDays(1000))
print(YearsInDays(1000))

And as a last touch, we see there is duplicit code, so let's deal with it right now.

function = DivideWithoutRemainder(num1, num2){
  remainder = num1 %% num2
  return((num1 - remainder) / num2)
}
function = WeeksInDays(nDays){
  nWeeks = DivideWithoutRemainder(nDays, 7)
  return(nWeeks);
}
function = YearsInDays(nDays){
  nYears = DivideWithoutRemainder(nDays, 365)
  return(nYears);
}
print(WeeksInDays(1000))
print(YearsInDays(1000))

This created legible structured code that is easy to maintain as well as understand. Didn't we go quite a trip from the horrific start?