Announcement

Collapse
No announcement yet.

Calc and multiple linear regression

Collapse
This topic is closed.
X
X
 
  • Filter
  • Time
  • Show
Clear All
new posts

    Calc and multiple linear regression

    I have read that Excel solves multiple linear regression problems that contain 1 or more dummy variables (a variable whose value must be 0 or 1 to indicate the presence or absence of some attribute). I tried to solve such a problem using Calc's Linest function, and it failed. I looked at Calc's user Guide, and I see no mention of dummy variables. Does anyone know if Calc can solve such problems? If Calc cannot, then do you know of any open source software that does solve such problems?

    Re dummy variables, I think of them as vectors whose elements are all 0 or 1.

    #2
    Can you provide an example of a 'multiple linear regression problem' for evaluation?
    Windows no longer obstructs my view.
    Using Kubuntu Linux since March 23, 2007.
    "It is a capital mistake to theorize before one has data." - Sherlock Holmes

    Comment


      #3
      Seems to work here, using KCalc on 16.04.3 LTS.

      I followed an example for Excel at http://www.statisticshowto.com/proba...sion-equation/

      The input results:

      SUBJECT AGE X GLUCOSE LEVEL Y Regression
      1 43 99
      2 21 65 Regression Model Linear
      3 25 79 R^2 0.2806974725
      4 42 75 Standard Error 10.8611976552
      5 57 87
      6 59 81 Slope 0.3852249832
      Σ 247 486 Intercept 65.1415715245
      43 81.7062458026
      21 73.2312961719
      25 74.7721961048
      42 81.3210208193
      57 87.0993955675
      59 87.8698455339
      Windows no longer obstructs my view.
      Using Kubuntu Linux since March 23, 2007.
      "It is a capital mistake to theorize before one has data." - Sherlock Holmes

      Comment


        #4
        Thanks for your last post. The problem you posted is a simple linear regression problem, for it has only 1 predictor variable, namely age. I need to solve a regression problem that (1) has several predictor variables and (2) one of the predictor variables is a dummy variable (its value can be only 0 or 1). To give you an idea of what I'm talking about, consider this text:

        I want to analyze real estate sales data to derive an equation that can be used to predict the value of a home having such characteristics as (1) square footage of the home, (2) number of bedrooms, (3) number of bathrooms, (4) size of lot (in square feet), (5) year built, and (6) does it have a swimming pool (Yes or No)--1 for Yes; 0 for No. Of those 6 variables, only #6 is a "dummy variable".

        That problem differs from the problem you posted in 2 ways, First, it has multiple predictor variables and (2) one of them (number 6) must be either 0 or 1 for each home. I can gather the numeric data from my local Multiple Listing Service (I am a Realtor). I can get the sales price of homes that have sold in the last say 6 months. For each such home, I can get the values for the 6 predictor variables I just named. I would have no problem using Calc's Linest function to get the equation were it not for the swimming-pool variable.

        I used the Linest function on these data that I concocted just for test purposes:

        Sale Price Square Feet Bedrooms S. Pool
        200,000 2,500 3 0
        240,000 2,600 4 0
        310,000 2,700 4 1

        I put those data into columns A, B, C, and D. The numbers started in row 2. So I entered A2:A4 as the Y data; B2:C4 as the X data, and it worked, but when I changed C4 to D4, it promptly said Err:504. That error code is "Parameter list error", which it explains as "Function parameter is not valid; for example, text instead of a number, or a domain reference instead of a cell reference." So Linest evidently does not handle dummy variables. I have read that Excel does handle dummy variables. I'm using version 5.1.6.2 of LibreOffice. The swimming-pool variable is important. I do not want to have to exclude it. Any suggestions?

        Last edited by wtb32141; Dec 21, 2017, 04:14 AM.

        Comment


          #5
          Would https://help.libreoffice.org/Calc/Solver be of relevance?
          Kubuntu 20.04

          Comment


            #6
            Thanks for the suggestion. I'll check it out.

            Comment


              #7
              Google Sheets has a Statistics package with a Regression component. That component solves the type of problem that I sketched earlier; so I have been using Google sheets to solve regression problems that include or exclude a dummy variable. I'm pleased with the results.

              If you add a dummy variable, then to get it to work, I had to enter Yes or No instead of 1 and 0 or even Y or N.
              Last edited by wtb32141; Dec 28, 2017, 07:03 AM.

              Comment

              Working...
              X