
Emerging Trends in Big Data Technology and Capabilities
Explore the evolution of Big Data technology, tools, and capabilities in the early 21st century, including the emergence of open-source tools, structured and unstructured data handling, and the significant role played by leading high-tech companies in advancing this technology.
Presentation Transcript
Big Data Technology and Technological Capability
Nayem Rahman
Department of Engineering and Technology Management, Portland State University
November 18, 2020
Literature Review Qualitative Methods Research Model Research Hypotheses Research Design Data Collection Results & Discussion Conclusions & Contributions Introduction Data Analysis
Agenda
Introduction
Literature Review
Qualitative Methods
Research Model
Research Design
Data Collection and Analysis
Results and Discussion
Conclusions and Contributions

Introduction
In the early 21st century, Big Data came into the picture.
To handle big data, a completely new set of tools and technologies has emerged.
A non-profit organization, the Apache Software Foundation, has provided a handful of open-source big data tools and technologies.
The inventors, contributors, and early adopters of big data tools and technologies include leading high-tech companies: Google, Yahoo, Facebook, Microsoft, Amazon, Intel, and IBM.
Big Data has emerged with a handful of tools and technologies, and leading high-tech companies are pioneers in using this new technology.

Structured and Unstructured Data
Structured data is stored in conventional database systems such as IBM DB2, Oracle, MS SQL, Teradata, and MySQL.
Big Data is large and complex, and cannot be stored in conventional data storage.
Big Data includes Internet-generated data, machine-generated (sensor) data, and social networking (e.g., Twitter) data.
More than 90% of data is unstructured [1, 36].
The unstructured nature of big data makes it distinct from conventional transactional data, which warrants the use of a new and different set of tools and technologies.

Sources of Big Data [40] [13]
(Figure. Source: statista.com)
Big Data consists of internal transactional data and external data.

Big Data Characteristics [13]
(Figure. Source: statista.com)
Big Data has distinct characteristics. Given these characteristics, what makes big data technology useful?

Big Data Technology
Apache Hadoop Distribution: open source; HDFS and MapReduce/Spark; Hadoop is used particularly for large-scale, on-premise deployments.
Alternative Cloud Platforms: vendor-provided [10].
Google Cloud Platform: Dataproc, BigQuery, GCS, Cloud SQL.
Amazon Elastic MapReduce (EMR): Amazon S3, Apache Spark, Apache Hive, Apache HBase.
Microsoft Azure: Azure Data Explorer, Cosmos DB, Azure Data Lake, Azure HDInsight, Azure Stream Analytics.
Big Data technologies include on-premise and cloud-based distributions.

Problem Statement
Data growth in companies
Large volume of enterprise data
Data available from external sources
Challenges in managing unstructured data
Practical obstacles in implementing big data projects [39]
Big data technologies are complex, and specialized skillsets are required
Many companies are not sure about the business value of big data projects [38]
The adoption rate [14] of big data technology is still low, which indicates the need for more research to understand users' adoption of big data technologies.

Research Gaps and Questions
Research Gaps
Most of the empirical research has focused on technical aspects (algorithms, machine learning, etc.) and system development. There is a lack of in-depth analysis of the factors that influence the adoption of big data technology [4].
Leading Technology Acceptance Model (TAM) researchers point out that very little research investigates what actually makes a system useful. Perceived usefulness and perceived ease of use have largely been treated as black boxes [2].
Most empirical studies of TAM are criticized for lacking important technological factors and for TAM's limited ability to explain variance (up to 47%) [6, 3].
Research Goals
To explore and study the key factors that are associated with users' adoption and use of Hadoop in the U.S.
To develop a research model, based on existing IT models, that includes important factors identified, reviewed, and evaluated through a number of qualitative methods, in order to better study the key factors associated with users' intention to use Hadoop.
Research Questions
What are the key factors that are associated with industry users' behavioral intention to adopt and use the Hadoop technology?
How can users' experience of Hadoop be improved or enhanced?
A literature review has been conducted, and a number of research gaps, goals, and questions have been identified.

Adoption Factors Taxonomy Based on Literature
(Figure)
The literature review resulted in 32 factors.

Research Steps: Qualitative Study
Research step: Description
Literature Review: An extensive literature review was conducted, and a taxonomy of factors related to the adoption and use of big data technology was developed.
Brainstorming: A one-hour brainstorming session was conducted with 9 industry experts.
Focus Group: A one-hour focus group session was conducted with 10 participants.
Interviews: Individual interviews were conducted with 21 participants from 13 companies. Interviews took 15 to 20 minutes each.
Targeted participants: Experienced users of Hadoop (with at least 3 years of usage) and individuals who have work experience in industry and its related sectors.
A number of qualitative methods were used to introduce new factors and to evaluate, select, and validate the factors that are most associated with the adoption of Hadoop and can be added to the research model.

Proposed Research Model Based on Qualitative Study
(Diagram)
Constructs: Training & Skills (TR), Scalability (SC), Functionality (FN), Data Storage & Processing (DS), Security & Privacy (SP), Flexibility (FL), Data Analytics Capability (DA), Output Quality (OQ), Performance Expectancy (PE), Facilitating Conditions (FC), Cost-Effectiveness (COST), Reliability (RL), Perceived Ease of Use (PEOU), Perceived Usefulness (PU), Behavioral Intention (BI), Actual Use (AU).
Hypothesized paths in the diagram are labeled H1 through H13, H14a, H14b, and H15.

Research Design
Data Collection: web-based survey (using Qualtrics software); expert panel validation; invitations sent to Hadoop User Groups (with an e-link to the web-based survey).
Data Analysis: structural equation modeling (SEM), using AMOS 26 software.
Population: 14 Hadoop User Groups in the U.S., with 33K users in total.
Targeted Population: two of the 14 user groups, the Bay Area Hadoop User Group and the Hadoop-NYC User Group.

Research Design: Sampling Method
Cluster Sampling
The population is divided into separate groups.
Clusters need to be homogeneous, and each cluster should have distinct subpopulations.
Big data user groups as clusters
Hadoop users are organized in different Hadoop user groups.
Two user groups (clusters) are randomly selected out of 14 user groups.
The sample consists of every member of these two Hadoop user groups.
Sampling Frame
Two Hadoop user groups consisting of 10,500 subscribers.
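
The cluster selection described on this slide can be sketched in a few lines of Python. This is only an illustration, not the procedure the author ran: only two group names appear in the slides, so the other twelve names, the function name `select_clusters`, and the seed value are hypothetical.

```python
import random

# Two group names come from the slides; the other twelve are hypothetical
# placeholders standing in for the remaining U.S. Hadoop user groups.
user_groups = ["Bay Area Hadoop User Group", "Hadoop-NYC User Group"] + [
    f"Hadoop User Group {i}" for i in range(3, 15)
]

def select_clusters(clusters, n_clusters, seed):
    """One-stage cluster sampling: draw whole clusters at random.

    Every member of a selected cluster then belongs to the sample.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(clusters, n_clusters)

selected = select_clusters(user_groups, n_clusters=2, seed=2020)
print(selected)
```

Because whole groups are drawn, the sampling frame is simply the combined membership of the selected groups.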

Developing and validating the survey instrument
No. Step: Description (Outcome)
1. Preliminary Version: Developed the initial version based on previous and related surveys; added new questions. (Version 1)
2. Pre-Validate 1: Big data experts from industry provided feedback and comments. (Version 2)
3. Pre-Validate 2: TAM experts from academia, who have research backgrounds, provided feedback and comments. (Version 3)
4. Expert Panel: The validation tool was used to rate and judge the relevance of each question and the ease of answering each question. (Version 4)
5. Pilot Test: This test was conducted using a web-based survey sent via email to a group of Hadoop users. (Version 5)

Survey Administration
An invitation email, which included a link to the web-based survey, was sent to the two selected Hadoop User Groups consisting of 10,500 possible participants.
Two follow-up reminders were sent.
53 responses were deleted during data screening, which brought the final response rate (RR) to 349/10,500 (about 3.3%).
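
The response-rate arithmetic can be checked with a short calculation. The figure for responses received before screening is an inference (349 usable plus 53 removed), not a number stated in the slides, and the variable names are illustrative.

```python
invited = 10_500   # subscribers in the two selected user groups (from the slides)
usable = 349       # responses kept after data screening (from the slides)
removed = 53       # non-engaged responses deleted during screening (from the slides)

# Assumption: the 53 deleted responses were removed from the total received,
# so the number received before screening is usable + removed.
received = usable + removed

raw_rate = received / invited
final_rate = usable / invited
print(f"raw: {raw_rate:.2%}, final: {final_rate:.2%}")
```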

Survey Administration: Non-Response Bias
Estimating non-response bias: Extrapolation Method
"Successive waves of a questionnaire ... persons who responded in later waves are assumed to have responded because of the increased stimulus and are expected to be similar to non-respondents" [9]
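
One common way to apply the extrapolation idea is to compare answers across response waves with a one-way ANOVA. The sketch below computes only the F statistic in plain Python; the wave scores are hypothetical, and a p-value would additionally require the F distribution (for example via a statistics package), which is left out here.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of score lists."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical 1-to-5 Likert answers for one survey item, grouped by wave
# (initial invitation, first reminder, second reminder).
waves = [[4, 5, 3, 4, 5], [4, 4, 5, 3, 4], [3, 5, 4, 4, 4]]
print(one_way_anova_f(waves))
```

A small F value relative to the critical value suggests that late respondents answer much like early ones, which is the pattern taken as evidence against non-response bias.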

Survey Administration
Data screening: 53 non-engaged respondents were identified and removed.
No outliers (since a Likert scale of 1 to 5 was used).
Test for non-response errors: ANOVA showed no significant difference among respondents' answers from the three waves of the survey.
Instrument reliability through internal consistency: Cronbach's alpha, Average Variance Extracted (AVE), and Construct Reliability (CR).
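
The three reliability measures named on this slide have standard textbook formulas, sketched below in plain Python. The item scores and factor loadings are hypothetical, and the function names are illustrative; the study itself reports using AMOS rather than code like this.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per survey item, aligned by respondent."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent total
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))

def average_variance_extracted(loadings):
    """Mean of the squared standardized factor loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def construct_reliability(loadings):
    """(sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_sum)

# Hypothetical data: three items answered by five respondents,
# and three standardized loadings for one construct.
items = [[4, 5, 3, 4, 2], [4, 4, 3, 5, 2], [5, 5, 3, 4, 1]]
loadings = [0.70, 0.80, 0.75]

print(cronbach_alpha(items))
print(average_variance_extracted(loadings))
print(construct_reliability(loadings))
```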

Survey Respondents: Industry Profile
(Figure)

Survey Respondents: Job Roles
(Figure)

Data Analysis Using SEM
Data screening, and testing the internal consistency between items and constructs.
The SEM process usually goes through a number of steps [6]:
1. Defining the individual factors (variables/constructs)
2. Developing and specifying the measurement model
3. Designing a study to produce empirical results
4. Assessing the measurement model validity
5. Specifying the structural model
Decision point: is the model fit acceptable? If no, the measurement model is revisited to obtain an acceptable model fit. If yes, the measurement model is converted into a structural model to test the hypothesized relationships among the model constructs, leading to Results & Discussion.

Discriminant Validity
Criteria: AVEs should be greater than the squared correlation (R-squared), and the square roots of the AVEs should be greater than the correlation estimate.
Factor pair      Corr.   Corr.^2  AVE1    AVE2    sqrt(AVE1)  sqrt(AVE2)
SC <--> DS       0.698   0.487    0.524   0.548   0.730       0.740
SC <--> PE       0.602   0.362    0.524   0.636   0.730       0.798
SC <--> RL       0.691   0.477    0.524   0.544   0.730       0.738
SC <--> FL       0.667   0.445    0.524   0.625   0.730       0.791
SC <--> OQ       0.517   0.267    0.524   0.665   0.730       0.815
SC <--> TR       0.516   0.266    0.524   0.606   0.730       0.779
SC <--> PEOU     0.384   0.147    0.524   0.692   0.730       0.832
SC <--> FC       0.533   0.284    0.524   0.600   0.730       0.775
DS <--> PE       0.630   0.397    0.548   0.636   0.720       0.797
DS <--> RL       0.632   0.399    0.548   0.544   0.720       0.738
DS <--> FL       0.721   0.519    0.548   0.625   0.740       0.791
DS <--> OQ       0.560   0.313    0.548   0.665   0.740       0.815
DS <--> TR       0.542   0.294    0.548   0.606   0.740       0.779
DS <--> PEOU     0.420   0.176    0.548   0.692   0.740       0.832
DS <--> FC       0.534   0.285    0.548   0.600   0.740       0.775
PE <--> RL       0.729   0.531    0.636   0.544   0.797       0.712
PE <--> FL       0.711   0.506    0.636   0.625   0.797       0.791
PE <--> OQ       0.786   0.618    0.636   0.665   0.797       0.815
PE <--> TR       0.701   0.491    0.636   0.606   0.797       0.779
PE <--> PEOU     0.675   0.456    0.636   0.692   0.797       0.832
PE <--> FC       0.675   0.456    0.636   0.600   0.797       0.775
RL <--> FL       0.731   0.534    0.544   0.625   0.738       0.791
RL <--> OQ       0.636   0.404    0.544   0.665   0.738       0.815
RL <--> TR       0.636   0.404    0.544   0.606   0.738       0.779
RL <--> PEOU     0.544   0.296    0.544   0.692   0.738       0.832
RL <--> FC       0.606   0.367    0.544   0.600   0.738       0.775
FL <--> OQ       0.658   0.433    0.625   0.665   0.791       0.815
FL <--> TR       0.653   0.426    0.625   0.606   0.791       0.779
FL <--> PEOU     0.532   0.283    0.625   0.692   0.791       0.832
FL <--> FC       0.598   0.358    0.625   0.600   0.791       0.775
OQ <--> TR       0.760   0.578    0.665   0.606   0.815       0.779
OQ <--> PEOU     0.691   0.477    0.665   0.692   0.815       0.832
OQ <--> FC       0.772   0.596    0.665   0.600   0.815       0.775
TR <--> PEOU     0.574   0.329    0.606   0.692   0.779       0.832
TR <--> FC       0.664   0.441    0.606   0.600   0.779       0.775
PEOU <--> FC     0.657   0.432    0.692   0.600   0.813       0.775
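
The rule stated in the table header (the square root of each construct's AVE should exceed its correlation with any other construct) can be checked mechanically. The sketch below uses the AVE values and a handful of the highest correlations from the table; it recomputes the square roots from the AVEs, so small differences from the printed square-root columns are possible. The function name is illustrative.

```python
import math

# AVE per construct, as listed in the table's AVE columns.
ave = {"SC": 0.524, "DS": 0.548, "PE": 0.636, "RL": 0.544, "FL": 0.625,
       "OQ": 0.665, "TR": 0.606, "PEOU": 0.692, "FC": 0.600}

# A subset of the table: the pairs with the highest correlation estimates.
correlations = {
    ("SC", "DS"): 0.698,
    ("DS", "FL"): 0.721,
    ("PE", "OQ"): 0.786,
    ("RL", "FL"): 0.731,
    ("OQ", "TR"): 0.760,
    ("OQ", "FC"): 0.772,
}

def passes_discriminant_check(pair, r):
    """True when r is below the square root of both constructs' AVEs."""
    first, second = pair
    return r < math.sqrt(ave[first]) and r < math.sqrt(ave[second])

results = {pair: passes_discriminant_check(pair, r)
           for pair, r in correlations.items()}
for pair, ok in results.items():
    print(pair, "pass" if ok else "fail")
```

Checking the tightest pairs first is a practical shortcut: if the highest correlations satisfy the rule, lower correlations involving the same constructs will as well.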

Assessing Model Validity
Model Fit Indices
Confirmatory Factor Analysis (CFA) was conducted using IBM AMOS v26 software.
Modification indices were used in the CFA to determine whether there were opportunities to improve the model.

Specifying the Structural Model
(Figure)

Final Model
(Diagram)
Constructs: Training & Skills (TR), Functionality (FN), Scalability (SC), Data Storage & Processing (DS), Security & Privacy (SP), Flexibility (FL), Data Analytics Capability (DA), Output Quality (OQ), Performance Expectancy (PE), Cost-Effectiveness (COST), Facilitating Conditions (FC), Reliability (RL), Perceived Ease of Use (PEOU), Perceived Usefulness (PU), Behavioral Intention (BI), Actual Usage (AU).
Path significance legend: * p<.05, ** p<.01, *** p<.001, N/S not significant.
R2 values shown in the diagram: .85, .80, and .67.

Results and Discussion
Constructs (IV and DV): Relevance to Literature; Findings & Comments
Scalability: Prior research has not used this factor. This is a new factor validated using TAM. Finding: Significant influence. More users might be influenced by the technological capability of a technology.
Data Storage & Processing: Prior research has not used this factor. This is a new factor validated using TAM. Finding: Significant. Implies that users of big data technology are looking for robust storage and processing capability in big data technology.
Cost-Effectiveness: Prior research has not tried this factor using TAM. Finding: Non-significant. Organizations might not be sensitive to cost. Respondents were developers and architects, so they might not be concerned about costs.
Performance Expectancy: Prior research has validated this factor using UTAUT [18]. Hence, consistent with the extant literature. Finding: Significant. Implies that companies are influenced by the expected performance, benefits, and gains.
Security & Privacy: Prior research has not tried this factor using TAM. Finding: Non-significant. Data security and privacy have become very important. It is worth testing this construct in future research.
Reliability: Prior research has not tried this factor using TAM or IS theory. Finding: Significant. It is expected to provide IT leadership with confidence in using this technology.
Data Analytics Capability: Prior research has not tried this factor using TAM or IS theory. Finding: Non-significant. Hadoop's main component itself is not a specific tool used for data analytics.

Results and Discussion
Constructs (IV and DV): Relevance to Literature; Findings & Comments
Training & Skills: Prior research validated this factor using TAM [26]. Thus, consistent with the extant literature. Finding: Significant. Managers might consider providing training sessions for Hadoop users given Hadoop's complexity.
Flexibility: Prior research has not tried this factor using TAM or IS theory. Finding: Significant. Hadoop enables one to integrate and access new sources of data, both structured and unstructured.
Output Quality: Prior research has validated this factor using TAM2 [28]. Hence, consistent with the extant literature. Finding: Significant. Implies that output quality should reflect the correct data and be traceable all the way back to where it was generated.
Functionality: Prior research has not tried this factor using TAM or IS theory. Finding: Non-significant. I believe this factor was substituted by other capability factors such as scalability, data storage and processing, and flexibility.
Facilitating Conditions: Prior research validated this factor using TAM [26] and UTAUT [18]. Hence, consistent with the extant literature. Finding: Significant. External (vendor) and internal (the organization's IT infrastructure) support is needed.
Perceived Usefulness: Prior research validated this factor using TAM [27]. Hence, consistent with the extant literature. Finding: Significant. One of the core constructs of the TAM (dependent variable). The model explains 80% of the variance.
Perceived Ease of Use: Prior research validated this factor using TAM [27]. Hence, consistent with the extant literature. Finding: Significant. One of the core constructs of the TAM (independent variable).
Behavioral Intention: Prior research validated this factor using TAM [27]. Hence, consistent with the extant literature. Finding: Significant. One of the core constructs of the TAM (dependent variable). The model explains 67% of the variance.

Conclusion
This research provides evidence that technological capability plays an important role in enabling a complex and robust technology like Hadoop.
It provides an important insight: implementing a complex technology like Hadoop can lead to changes in employees' job characteristics and to an urgent need to provide more training to employees.
The results can be used by researchers, government, and industry to increase the adoption and use of the technology.

Theoretical Contribution
This research has successfully validated new constructs: scalability, data storage and processing capability, flexibility, and reliability.
This study has shown that TAM is valid in a new and technologically complex system implementation.
It provides new evidence in technology acceptance: technological capabilities are taken into consideration in acquiring new technology.
It provides outcomes from the context of organizational-level users' acceptance.
It advances theory: the model explained 80% of the variance in perceived usefulness and 67% of the variance in usage intentions.

Implications for Practitioners
First, this research provides insights into which technological characteristics and capabilities to look for when buying a complex technology.
Second, it gives managers action plans, such as training users, to lessen negative effects and improve skillsets.
Third, managers should make sure facilitating conditions exist to support different Hadoop users.
By using Hadoop, organizations might be able to bring together internal data and external data in HDFS.

Thank you

Hypotheses Development
Scalability (SC)
Scalability is considered an important factor in adopting and using new technologies.
Both academic and industry papers suggest that scalability is an important factor [10, 15].
The term scalability is used in industry when it comes to buying or using a technology [10].
The experts selected this factor as the number one factor during the qualitative research.
Therefore, the following hypothesis was proposed:
H1: Scalability, in terms of the Hadoop scale-out storage system, has a positive effect on perceived usefulness.
Data Storage and Processing (DS)
Data storage and processing is considered to have a significant relationship with users' intention to adopt and use new technologies.
This factor has not been used in past research as part of technology acceptance models.
Organizations have been accumulating large amounts of data for years, and data management is a new challenge.
This factor is proposed as a new construct in this research.
Therefore, the following hypothesis has been developed:
H2: Data storage and processing have a positive effect on perceived usefulness.

Hypotheses Development
Cost-Effectiveness (COST)
This construct has not been used as part of TAM, but it has been used by researchers working with other models [16, 17].
There is a perception that big data tools are cost-effective compared to traditional data management software systems [35].
Experts selected this factor during the qualitative research.
Therefore, the following hypothesis has been developed:
H3: Cost-effectiveness is positively related to actual use of Hadoop.
Performance Expectancy (PE)
The performance expectancy factor was used in the past as part of another technology acceptance model [18].
Findings in a number of previous studies showed that the performance expectancy factor has a strong relationship with users' intention to adopt and use technology [6].
Therefore, the following hypothesis has been developed:
H4: Performance expectancy is positively related to perceived usefulness of Hadoop.

Hypotheses Development
Security and Privacy (SP)
The extant literature shows that this construct is important from the standpoint of data privacy and security [20].
Data security and privacy are receiving increasing attention these days.
The experts in the qualitative research voted this factor the fifth most important factor.
Therefore, the following hypothesis has been developed:
H5: Security and privacy are positively related to perceived usefulness of Hadoop.
Reliability (RL)
Based on the extant literature [3, 21], this construct has not been tested by IS theories or models in general, or by TAM in particular.
The Hadoop distributed file system (HDFS) is considered reliable because it keeps multiple copies of the same data on more than one node [22].
Therefore, the following hypothesis has been developed:
H6: Reliability is positively related to perceived usefulness of Hadoop.

Hypotheses Development
Data Analytics Capability (DA)
The extant literature does not reference this factor in any IS theory or model [3, 21].
Industry papers on big data suggest the importance of the data analytics capability of big data technology, including Hadoop [23, 24, 25].
The expert panel recommended that this factor be included for further study.
Therefore, the following hypothesis has been developed:
H7: Data analytics capability is positively related to perceived usefulness of Hadoop.
Training and Required Skills (TR)
Recent research on big data highlighted the value of big data investments relating to training [19].
In TAM research, training has been found to be a significant predictor of perceived usefulness [26].
Therefore, the following hypothesis has been developed:
H8: Training and required skills are positively related to perceived usefulness of Hadoop.

Hypotheses Development
Flexibility (FL)
The extant literature suggests that this factor has not been used before in TAM [3] or in any other IS model.
The experts in the qualitative part of this research suggested that this factor is important in Hadoop adoption.
Therefore, the following hypothesis has been developed:
H9: Hadoop's flexibility to consolidate data from various sources into a single place (storage) will have a positive effect on perceived usefulness of Hadoop.
Output Quality (OQ)
Venkatesh and Davis proposed this factor as part of TAM2 [28], a theoretical extension of the model.
Output quality has been found to be a significant determinant of perceived usefulness [6].
The experts in the qualitative part of this research suggested that this factor is important in Hadoop adoption.
Therefore, the following hypothesis has been developed:
H10: Output quality is positively related to the perceived usefulness of Hadoop.

Hypotheses Development
Functionality (FN)
The extant literature suggests that this construct has not been used [3, 21].
Functionality provides users with the capability to do on-the-job tasks by using the software or system.
This research has incorporated this factor based on the qualitative study results.
Therefore, the following hypothesis has been developed:
H11: Functionality is positively related to perceived usefulness of Hadoop.
Facilitating Conditions (FC)
This factor has been part of one of the technology acceptance models [18].
In a number of previous studies, the facilitating conditions factor was found to have a significant and positive influence on users' adoption and use of technology [20, 26].
Therefore, the following hypothesis has been developed:
H12: Facilitating conditions have a positive effect on actual use of Hadoop.

Hypotheses Development
Perceived Usefulness (PU)
Perceived usefulness as a significant predictor of behavioral intention to use technology was supported in studies by Davis [5, 27], Adams et al. [29], and many other researchers.
The extant literature reports that perceived usefulness is a major determinant in the U.S. workplace [30].
Therefore, the following hypothesis has been developed:
H13: Perceived usefulness has a positive effect on behavioral intention to use Hadoop.
Perceived Ease of Use (PEOU)
Perceived ease of use is a core construct of Davis's original TAM [27].
Perceived ease of use is linked to behavioral intention to use indirectly (PEOU → PU → BI), which is supported by extensive evidence [28].
Therefore, the following hypothesis has been developed:
H14a: Perceived ease of use (PEOU) has a positive effect on perceived usefulness (PU) in using Hadoop.

Hypotheses Development
Perceived Ease of Use (PEOU)
Perceived ease of use is a core construct of Davis's original TAM [27].
Perceived ease of use is also linked to behavioral intention to use directly (PEOU → BI), which is supported by extensive evidence [28].
Therefore, the following hypothesis has been developed:
H14b: Perceived ease of use (PEOU) has a positive effect on behavioral intention to use Hadoop.
Behavioral Intention (BI)
Behavioral intention has been found to be a strong and important predictor of usage behavior (actual use).
This is one of the main constructs of TAM, developed by Davis [27]. The construct is also used in a later model, UTAUT, developed by Venkatesh et al. [18].
A number of studies show that behavioral intention is likely to be correlated with actual usage.
Therefore, the following hypothesis has been developed:
H15: Behavioral intention (BI) has a positive effect on actual use of Hadoop.
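
The sixteen hypothesized paths (H1 to H13, H14a, H14b, H15) can be collected into a small edge list, which makes it easy to see which constructs feed each outcome. The construct abbreviations and path directions follow the hypothesis statements in these slides; the dictionary layout and the helper name `predictors_by_outcome` are illustrative choices, not part of the study.

```python
from collections import defaultdict

# Each hypothesis maps to (predictor, outcome), as stated in the slides.
hypotheses = {
    "H1": ("SC", "PU"),     "H2": ("DS", "PU"),    "H3": ("COST", "AU"),
    "H4": ("PE", "PU"),     "H5": ("SP", "PU"),    "H6": ("RL", "PU"),
    "H7": ("DA", "PU"),     "H8": ("TR", "PU"),    "H9": ("FL", "PU"),
    "H10": ("OQ", "PU"),    "H11": ("FN", "PU"),   "H12": ("FC", "AU"),
    "H13": ("PU", "BI"),    "H14a": ("PEOU", "PU"),
    "H14b": ("PEOU", "BI"), "H15": ("BI", "AU"),
}

def predictors_by_outcome(paths):
    """Group predictor constructs under the outcome they are hypothesized to affect."""
    grouped = defaultdict(list)
    for predictor, outcome in paths.values():
        grouped[outcome].append(predictor)
    return dict(grouped)

model = predictors_by_outcome(hypotheses)
for outcome, predictors in model.items():
    print(outcome, "<-", ", ".join(predictors))
```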

Limitations
This study relies on respondents' self-reported data. Some researchers suggest that self-reported usage does not always reflect actual usage, a commonly reported limitation [3].
This study collected data at a single point in time, as opposed to different time periods.
The results cannot be generalized to all U.S. Hadoop users or to users of any other big data technology, since the data were collected from two Hadoop user groups.
Future Research
This research validated new independent variables. Further studies are needed to establish their validity more widely.
This research surveyed actual users. Future research might survey managers and executives.
The data in this research were collected from two Hadoop user groups. Future research could extend the data collection to include all Hadoop user groups.
References
[1] Baesens, B., Bapna, R., Marsden, J.R., Vanthienen, J., & Zhao, J.L. (2016). Transformational issues of big data and analytics in networked business. MIS Quarterly, 40(4), 807-818.
[2] Benbasat, I., & Barki, H. (2007). Quo vadis, TAM? Journal of the Association for Information Systems, 8(4), 211-218.
[3] Lee, Y., Kozar, K.A., & Larsen, K.R.T. (2003). The technology acceptance model: Past, present, and future. Communications of the Association for Information Systems, 12, 752-780.
[4] Kwon, O., Lee, N., & Shin, B. (2014). Data quality management, data usage experience and acquisition intention of big data analytics. International Journal of Information Management, 34, 387-394.
[5] Davis, F.D. (1993). User acceptance of information technology: System characteristics, user perceptions and behavioral impacts. International Journal of Man-Machine Studies, 38(3), 475-487.
[6] Aldhaban, F. (2016). Exploratory Study of the Adoption and Use of the Smartphone Technology in Emerging Regions: Case of Saudi Arabia. PhD Dissertation in Technology Management, Portland State University.
[7] Hood-Clark, S.F. (2016). Influences on the use and behavioral intention to use big data. Doctoral Dissertation, School of Business and Technology, Capella University, USA. Publisher: ProQuest LLC.
[8] Malaka, I., & Brown, I. (2015). Challenges to the Organisational Adoption of Big Data Analytics: A Case Study in the South African Telecommunications Industry. In Proceedings of the ACM 2015 Annual Conference of the South African Institute of Computer Scientists and Information Technologists (SAICSIT 2015), September 28-30, 2015, Stellenbosch, South Africa.
[9] Armstrong, J.S., & Overton, T.S. (1977). Estimating nonresponse bias in mail surveys. Journal of Marketing Research, 14(3), 396-402.
[10] Rosencrance, L. (2019). The main picks for Hadoop distributions on the market. Link: https://searchdatamanagement.techtarget.com/feature/The-main-picks-for-Hadoop-distributions-on-the-market
[11] Columbus, L. (2017). 53% Of Companies Are Adopting Big Data Analytics. Forbes. Retrieved on 4/29/2020 from: https://www.forbes.com/sites/louiscolumbus/2017/12/24/53-of-companies-are-adopting-big-data-analytics/#58231c1a39a1
[12] Kulkarni, R. (2019). Big Data Goes Big. Forbes. Retrieved on 4/29/2020 from: https://www.forbes.com/sites/rkulkarni/2019/02/07/big-data-goes-big/#1a63d3e520d7
[13] SAS Insights (2020). Hadoop History. Retrieved on 4/25/2020 from: https://www.sas.com/en_us/insights/big-data/hadoop.html
[14] Technavio (2020). Big Data Market 2020-2024: Growing Investment in Smart City Initiatives to Boost Growth. Technavio. Link: https://www.businesswire.com/news/home/20200309005078/en/Big-Data-Market-2020-2024-Growing-Investment-Smart
[15] Sen, A., & Sinha, A.P. (2005). A comparison of data warehousing methodologies. Communications of the ACM, 48(3), 79-84.
[16] Premkumar, G., & Potter, M. (1995). Adoption of computer aided software engineering (CASE) technology: An innovation adoption perspective. The DATABASE for Advances in Information Systems, 26(2-3), 105-124.
[17] Phan, K., & Daim, T. (2011). Exploring technology acceptance for mobile services. Journal of Industrial Engineering and Management, 4, 339-360.
[18] Venkatesh, V., Morris, M.G., Davis, G.B., & Davis, F.D. (2003). User Acceptance of Information Technology: Toward a Unified View. MIS Quarterly, 27(3), 425-478.
[19] Tambe, P. (2016). Big data investment, skills, and firm value. MIT Research Brief, 9, 1-6.
[20] Moody, G.D., Siponen, M., & Pahnila, S. (2018). Toward a unified model of information security policy compliance. MIS Quarterly, 42(1), 285-311.
[21] Hameed, M.A., Counsell, S., & Swift, S. (2012). A conceptual model for the process of IT innovation adoption in organizations. Journal of Engineering and Technology Management, 29, 358-390.
[22] Shvachko, K., Kuang, H., Radia, S., & Chansler, R. (2010). The Hadoop Distributed File System. In Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST), 1-10. IEEE Computer Society, Washington, DC, USA.
[23] Abbasi, A., Sarker, S., & Chiang, R.H.L. (2016). Big Data Research in Information Systems: Toward an Inclusive Research Agenda. Journal of the Association for Information Systems, 17(2), Article 3. DOI: 10.17705/1jais.00423.
[24] Akoka, J., Comyn-Wattiau, I., & Laoufi, N. (2017). Research on Big Data - A systematic mapping study. Computer Standards & Interfaces, 54, 105-115.
[25] Gandomi, A., & Haider, M. (2015). Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management, 35, 137-144.
[26] Rajan, C.A., & Baral, R. (2015). Adoption of ERP system: An empirical study of factors influencing the usage of ERP and its impact on end user. IIMB Management Review, 27, 105-117.
[27] Davis, F.D. (1989). Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology. MIS Quarterly, 13(3), 319-340.
[28] Venkatesh, V., & Davis, F.D. (2000). A theoretical extension of the technology acceptance model: Four longitudinal field studies. Management Science, 46(2), 186-204.
[29] Adams, D., Nelson, R., & Todd, P. (1992). Perceived usefulness, ease of use and usage of information technology: A replication. MIS Quarterly, 16(2), 227-247.
[30] Igbaria, M., Iivari, J., & Maragahh, H. (1995). Why do individuals use computer technology? A Finnish case study. Information & Management, 29, 227-238.
[31] Hess, T.J., McNab, A.L., & Basoglu, K.A. (2014). Reliability generalization of perceived ease of use, perceived usefulness, and behavioral intentions. MIS Quarterly, 38(1), 1-28.
[32] Hendrickson, A.R., Massey, P.D., & Cronan, T.P. (1993). On the Test-Retest Reliability of Perceived Usefulness and Perceived Ease of Use Scales. MIS Quarterly, 17(2), 227-230.
[33] Venkatesh, V. (2000). Determinants of Perceived Ease of Use: Integrating Control, Intrinsic Motivation, and Emotion into the Technology Acceptance Model. Information Systems Research, 11(4), 342-365.
[34] Gefen, D., & Straub, D.W. (2000). The Relative Importance of Perceived Ease of Use in IS Adoption: A Study of E-Commerce Adoption. Journal of the Association for Information Systems, 1(1), 1-28, Article 8. DOI: 10.17705/1jais.00008.
[35] LearnTek (2018). The 6 Top Hadoop Distributions that You Can Employ for Your Big Data Needs. Retrieved on 4/30/2020 from: https://www.learntek.org/blog/top-hadoop-distributions/
[36] Das, T.K., & Kumar, P.M. (2013). BIG Data Analytics: A Framework for Unstructured Data Analysis. International Journal of Engineering and Technology (IJET), 5(1), 153-156.
[37] Khan Academy (2020). Sampling methods review. Retrieved on 5/14/2020 from: https://www.khanacademy.org/math/statistics-probability/designing-studies/sampling-methods-stats/a/sampling-methods-review
[38] Gartner, Inc. (2015). Gartner Survey Highlights Challenges to Hadoop Adoption. Gartner, Inc., Stamford, CT, USA. http://www.gartner.com/newsroom/id/3051717