
Could physically simulating neural structure on a fundamental level yield superior results to machine learning algorithms?





My curiosity is merely about whether (in a future where we have computers with the processing power of the human brain) actually simulating a neural network's physical behavior at the most fundamental level might be more effective than using algorithms to reach the same result.

I'm hoping for an answer which argues this topic on a purely fundamental level.


Is it likely (or not) that we might achieve a superior (more efficient, more intelligent, more dynamic) result by simulating the advanced behavior of neural interaction (the neuro-activity patterns the brain experiences to form emotions, develop conclusions, process sensory data, establish and/or revisit memories, etc.) than we can reach by using shortcuts (advanced machine learning algorithms, etc.)?


Very interesting question. Although I don't have a single bit of expertise in this area, I do have some references you may want to read. First is a paper by Merz and Fromherz (2005), where they grew snail neurons on a silicon chip. Pfister et al. (2007) also tried to grow neurons to allow interfacing between neurons and machines (e.g., for neural prostheses). There is thus definitely a field of research that believes neural interfacing may be better than mechanical and algorithmic networks.

There have been 10 years' worth of research since these papers, so I bet there have been advancements that may give you a more definitive answer than mine.


This article offers a formal account of curiosity and insight in terms of active (Bayesian) inference. It deals with the dual problem of inferring states of the world and learning its statistical structure. In contrast to current trends in machine learning (e.g., deep learning), we focus on how people attain insight and understanding using just a handful of observations, which are solicited through curious behavior. We use simulations of abstract rule learning and approximate Bayesian inference to show that minimizing (expected) variational free energy leads to active sampling of novel contingencies. This epistemic behavior closes explanatory gaps in generative models of the world, thereby reducing uncertainty and satisfying curiosity. We then move from epistemic learning to model selection or structure learning to show how abductive processes emerge when agents test plausible hypotheses about symmetries (i.e., invariances or rules) in their generative models. The ensuing Bayesian model reduction evinces mechanisms associated with sleep and has all the hallmarks of “aha” moments. This formulation moves toward a computational account of consciousness in the pre-Cartesian sense of sharable knowledge (i.e., con: “together” scire: “to know”).

This article presents a formal (computational) description of epistemic behavior that calls on two themes in theoretical neurobiology. The first is the use of Bayesian principles for understanding the nature of intelligent and purposeful behavior (Koechlin, Ody, & Kouneiher, 2003; Oaksford & Chater, 2003; Coltheart, Menzies, & Sutton, 2010; Nelson, McKenzie, Cottrell, & Sejnowski, 2010; Collins & Koechlin, 2012; Solway & Botvinick, 2012; Donoso, Collins, & Koechlin, 2014; Seth, 2014; Koechlin, 2015; Lu, Rojas, Beckers, & Yuille, 2016). The second is the role of self-modeling, reflection, and sleep (Metzinger, 2003; Hobson, 2009). In particular, we formulate curiosity and insight in terms of inference—namely, the updating of beliefs about how our sensations are caused. Our focus is on the transitions from states of ignorance to states of insight—namely, states with (i.e., con) awareness (i.e., scire) of causal contingencies. We associate these epistemic transitions with the process of Bayesian model selection and the emergence of insight. In short, we try to show that resolving uncertainty about the world, through active inference, necessarily entails curious behavior and consequent ‘aha’ or eureka moments.

The basic theme of this article is that one can cast learning, inference, and decision making as processes that resolve uncertainty about the world. This theme is central to many issues in psychology, cognitive neuroscience, neuroeconomics, and theoretical neurobiology, which we consider in terms of curiosity and insight. The purpose of this article is not to review the large literature in these fields or provide a synthesis of established ideas (e.g., Schmidhuber, 1991; Oaksford & Chater, 2001; Koechlin et al., 2003; Botvinick & An, 2008; Nelson et al., 2010; Navarro & Perfors, 2011; Tenenbaum, Kemp, Griffiths, & Goodman, 2011; Botvinick & Toussaint, 2012; Collins & Koechlin, 2012; Solway & Botvinick, 2012; Donoso et al., 2014). Our purpose is to show that the issues this diverse literature addresses can be accommodated by a single imperative (minimization of expected free energy, or resolution of uncertainty) that already explains many other phenomena–for example, decision making under uncertainty, stochastic optimal control, evidence accumulation, addiction, dopaminergic responses, habit learning, reversal learning, devaluation, saccadic searches, scene construction, place cell activity, omission-related responses, mismatch negativity, P300 responses, phase-precession, and theta-gamma coupling (Friston, FitzGerald et al., 2016; Friston, FitzGerald, Rigoli, Schwartenbeck, & Pezzulo, 2017). In what follows, we ask how the resolution of uncertainty might explain curiosity and insight.

1.1 Curiosity

Curiosity is an important concept in many fields, including psychology (Berlyne, 1950, 1954; Loewenstein, 1994), computational neuroscience, and robotics (Schmidhuber, 1991; Oaksford & Chater, 2001). Much of neural development can be understood as learning contingencies about the world and how we can act on the world (Saegusa, Metta, Sandini, & Sakka, 2009; Nelson et al., 2010; Nelson, Divjak, Gudmundsdottir, Martignon, & Meder, 2014). This learning rests on intrinsically motivated curious behavior that enables us to predict the consequences of our actions: as nicely summarized by Still and Precup (2012), “A learner should choose a policy that also maximizes the learner's predictive power. This makes the world both interesting and exploitable.” This epistemic, world-disclosing perspective speaks to the notion of optimal data selection and important questions about how rational or optimal we are in querying our world (Oaksford, Chater, & Larkin, 2000; Oaksford & Chater, 2003). Clearly, the epistemic imperatives behind curiosity are especially prescient in developmental psychology and beyond: “In the absence of external reward, babies and scientists and others explore their world. Using some sort of adaptive predictive world model, they improve their ability to answer questions such as what happens if I do this or that?” (Schmidhuber, 2006). In neurorobotics, these imperatives are often addressed in terms of active learning (Markant & Gureckis, 2014; Markant, Settles, & Gureckis, 2016), with a focus on intrinsic motivation (Baranes & Oudeyer, 2009). Active learning and intrinsic motivation are also key concepts in educational psychology, where they play an important role in enabling insight and understanding (Eccles & Wigfield, 2002).

1.2 Insight and Eureka Moments

The Eureka effect (Auble, Franks, & Soraci, 1979) was introduced to psychology by comparing the recall of sentences that were initially confusing but subsequently understood. The implicit resolution of confusion appears to be the main determinant of recall and the emotional concomitants of insight (Shen, Yuan, Liu, & Luo, 2016). Several psychological theories for solving insight problems have been proposed—for example, progress monitoring and representational change theory (Knoblich, Ohlsson, & Raney, 2001; MacGregor, Ormerod, & Chronicle, 2001). Both enjoy empirical support, largely from eye movement studies (Jones, 2003). Furthermore, several psychophysical and neuroimaging studies have attempted to clarify the functional anatomy of insight (see Bowden, Jung-Beeman, Fleck, & Kounios, 2005, for a psychological review, and Dresler et al., 2015, for a review of the neural correlates of insight in dreaming and psychosis). In what follows, we offer a normative framework that complements psychological theories by describing how curiosity engenders insight. Our treatment is framed by two questions posed by Berlyne (1954) in his seminal treatment of curiosity: “The first question is why human beings devote so much time and effort to the acquisition of knowledge. … The second question is why, out of the infinite range of knowable items in the universe, certain pieces of knowledge are more ardently sought and more readily retained than others?” (p. 180).

In brief, we will try to show that the acquisition of knowledge and its retention are emergent properties of active inference—specifically, that curiosity manifests as an active sampling of the world to minimize uncertainty about hypotheses—or explanations—for states of the world, while retention of knowledge entails the Bayesian model selection of the most plausible explanation. The first process rests on curious, evidence-accumulating, uncertainty-resolving behavior, while the second operates on knowledge structures (i.e., generative models) after evidence has been accumulated.

Our approach rests on the free energy principle, which asserts that any sentient creature must minimize the entropy of its sensory exchanges with the world. Mathematically, entropy is uncertainty or expected surprise, where surprise can be expressed as a free energy function of sensations and (Bayesian) beliefs about their causes. This suggests that creatures are compelled to minimize uncertainty or expected free energy. In what follows, we will see that resolving different sorts of uncertainty furnishes principled explanations for different sorts of behavior. These levels of uncertainty pertain to plausible states of the world, plausible policies that change those states, and plausible models of those changes.
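For readers who want the quantities above made explicit, they are usually written as follows in the active inference literature; the generative model $m$ and the approximate posterior $Q(s)$ are standard notation that the passage above leaves implicit, so this is an indicative sketch rather than the article's own derivation.

```latex
% Surprise, entropy, and the variational free energy bound
% (standard active-inference notation; m is the generative model,
% Q(s) the approximate posterior over hidden states s).
\begin{align}
\text{surprise:}\quad & -\ln P(o \mid m) \\
\text{entropy:}\quad  & H = \mathbb{E}_{P(o \mid m)}\!\left[-\ln P(o \mid m)\right] \\
\text{free energy:}\quad & F = \mathbb{E}_{Q(s)}\!\left[\ln Q(s) - \ln P(o, s \mid m)\right]
  = -\ln P(o \mid m) + D_{\mathrm{KL}}\!\left[Q(s) \,\Vert\, P(s \mid o, m)\right]
  \;\ge\; -\ln P(o \mid m)
\end{align}
```

Because the KL divergence is non-negative, minimizing $F$ with respect to beliefs $Q(s)$ tightens a bound on surprise, which is what licenses the phrase "surprise can be expressed as a free energy function of sensations and beliefs."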

The first level of uncertainty is about the causes of sensory outcomes under a particular policy (i.e., sequence of actions). Reducing this sort of uncertainty corresponds to perceptual inference (a.k.a. state estimation). In other words, the first thing we need to do is infer the current state of the world and the context in which we are operating. We then have to contend with uncertainty about policies per se that can be cast in terms of uncertainty about future states of the world, outcomes, and the probabilistic contingencies that bind them. We will see that minimizing these three forms of expected surprise—by choosing an uncertainty resolving policy—corresponds to information-seeking epistemic behavior, goal-seeking pragmatic behavior, and novelty-seeking curious behavior, respectively. In short, by pursuing the best policy, we accumulate experience and reduce uncertainty about probabilistic contingencies through epistemic learning—namely, inferring (the parameters of our models of) how outcomes are generated.
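One common way of making the three imperatives explicit is the following decomposition of the expected free energy of a policy $\pi$ at a future time $\tau$; exact sign conventions and groupings vary between papers, so this should be read as an indicative sketch rather than the article's own equation.

```latex
% Expected free energy G(pi, tau) under the predictive distribution
% \tilde{Q} = Q(o_\tau, s_\tau, \theta \mid \pi); C encodes prior preferences
% over outcomes. Signs and grouping differ slightly between treatments.
\begin{equation}
G(\pi,\tau) =
  \underbrace{-\,\mathbb{E}_{\tilde{Q}}\!\left[\ln P(o_\tau \mid C)\right]}_{\text{pragmatic (goal-seeking)}}
  \;-\; \underbrace{\mathbb{E}_{\tilde{Q}}\!\left[\ln Q(s_\tau \mid o_\tau, \pi) - \ln Q(s_\tau \mid \pi)\right]}_{\text{epistemic (information-seeking)}}
  \;-\; \underbrace{\mathbb{E}_{\tilde{Q}}\!\left[\ln Q(\theta \mid s_\tau, o_\tau, \pi) - \ln Q(\theta)\right]}_{\text{novelty (curiosity)}}
\end{equation}
```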

Finally, curious, novelty-seeking policies enable us to reduce our uncertainty about our generative models per se, leading to structure learning, insight, and understanding. Here, a generative model constitutes a hypothesis about how observable outcomes are generated, where we entertain competing hypotheses that are, a priori, equally plausible. In short, the last level of uncertainty reduction entails the selection of models that render outcomes the least surprising, having suppressed all other forms of uncertainty. All but the last process require experience to resolve uncertainty about either the states (inference) or parameters (learning) of a particular model. However, optimization of the model per se can proceed in a fact-free, or outcome-free, fashion, using experience accumulated to date. In other words, no further facts or outcomes are necessary for this last level of optimization: facts and outcomes are constitutive of the experience on which this optimization relies. It is this Bayesian model selection we associate with fact-free learning (Aragones, Gilboa, Postlewaite, & Schmeidler, 2005) and the emergence of insight (Bowden et al., 2005).


Short-Term Power Prediction of Building Integrated Photovoltaic (BIPV) System Based on Machine Learning Algorithms

One of the biggest challenges is ensuring large-scale integration of photovoltaic systems into buildings. This work presents power prediction for a building integrated photovoltaic (BIPV) system across the building’s various orientations, based on machine learning data science tools. The proposed prediction methodology comprises a data quality stage, a machine learning algorithm, a weather clustering assessment, and an accuracy assessment. The results showed that applying linear regression coefficients to the forecast outputs of the developed photovoltaic power generation neural network improved the forecast of PV power generation. The final model produced accurate forecasts, exhibiting a root mean square error of 4.42% for the neural network (NN), 16.86% for the quadratic support vector machine (QSVM), and 8.76% for the decision tree (TREE). The results are presented for building facade and roof applications such as the flat roof, south façade, east façade, and west façade.

1. Introduction

Economic growth has given rise to increasing global demand for electrical energy production and consumption. Solar power plants are among the most common renewable energy sources [1–4]. Satellite technology allows us to fly around the world. In addition to being easily installed on the roof of a building, PV modules can act as stand-alone solar power generators [5–7]. The installation of photovoltaic panels has increased every year in recent years; globally, 117 gigawatts of solar PV energy were generated in 2019 [8]. Traditional grid-based power distribution also relies on stable power supply lines and a consistent load [9]. Grid efficiency can be enhanced by controlling both suppliers and customers. Solar PV power could interfere with conventional power generation, making conventional generation difficult or even unworkable [10–14].

Machine learning has become more common in forecasting and classification because it reliably handles complex or nonlinear problems. Such methods can capture the relationship between input and output variables even when that relationship is difficult to represent explicitly [15]. The most common are artificial neural network (ANN) [16], fuzzy logic (FL) [17], support vector machine (SVM) [18], K-nearest neighbor (kNN) [19], and decision tree- (DT-) [20] based techniques (including random forest (RF) [21]). Specifically, artificial intelligence approaches are discussed in detail to improve photovoltaic performance forecasting models [22]. IRT is the most commonly accepted technique for categorising PV panels [23], centred on image processing techniques to distinguish between healthy and defective panels among all image processing-based approaches. Various patterns, challenges, and opportunities for the implementation of ANNs in PV systems are highlighted [24]. Random forests were the most reliable among the various forecasting techniques used by the site and regional forecasters [25]. Several mathematical models were developed to increase the accuracy of diagnoses [26]. Also, the use of PHANN for clear-sky days resulted in a standard deviation of 5.3%. In [27], the authors used a recurrent neural network (LSTM-RNN) to predict future PV generation, with RMSE results of approximately 82.15 W and 136.87 W for two separate datasets. Most PV forecasts now have a relative RMSE of more than 10% [28].

In ELM, the input weights and hidden-layer biases are assigned randomly, and the output weights are then computed by least squares rather than by iterative approaches. ELM helps to capture information and to transition better between situations. The capability of many ELM-based models has been presented, and their excellent capacity for predicting PV power production has been verified [29]. Results have also been obtained with an ensemble method combining lower upper bound estimation (LUBE) and ensemble learning. Considering the degree of convergence and prediction precision [30], ELM has been combined with the entropy method to build a hybrid forecast method for short-term PV power production, which is preferable to the radial basis function neural network and the generalised deep learning network [31]. An alternative multi-model approach based on ELM has been used for PV power prediction. Forecasting is essential for operating power plants and other utilities [32]. The feasibility-prospective forecasting model is also developed and proved effective in predicting the short-term power production of PV systems. Certain parameters are allocated at random in traditional artificial neural networks (ANNs), causing a certain degree of error and uncertainty in the prediction performance. Several artificial neural networks (ANNs) have been integrated with genetic algorithms (GA) to solve this problem. It is stated that a convolutional neural network system was effective in predicting solar irradiance, where the GA was applied to optimise the associated hyperparameters. However, in conventional ANNs, many parameters must be carefully optimised to establish learning strategies [33].
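As a concrete illustration of the ELM recipe sketched above (random hidden weights and biases, output weights solved by least squares in one step), here is a minimal numpy sketch. The toy data, layer size, and variable names are illustrative assumptions, not taken from the cited papers.

```python
# Minimal sketch of an Extreme Learning Machine (ELM) regressor: hidden-layer
# weights are drawn at random and only the output weights are fitted, by least
# squares, in a single step (no iterative training).
import numpy as np

rng = np.random.default_rng(0)

# Toy data: predict PV power (y) from two weather features (irradiance, temperature).
X = rng.uniform(0.0, 1.0, size=(200, 2))
y = 0.8 * X[:, 0] - 0.1 * X[:, 1] ** 2 + 0.05 * rng.normal(size=200)

n_hidden = 50

# 1) Randomly assign input weights and hidden-layer biases (never updated).
W = rng.normal(size=(X.shape[1], n_hidden))
b = rng.normal(size=n_hidden)

# 2) Compute the hidden-layer activations.
H = np.tanh(X @ W + b)

# 3) Solve for the output weights by least squares.
beta, *_ = np.linalg.lstsq(H, y, rcond=None)

# Prediction on new data reuses the fixed random hidden layer.
X_new = rng.uniform(0.0, 1.0, size=(5, 2))
y_pred = np.tanh(X_new @ W + b) @ beta
print(y_pred)
```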

Algorithms have been developed to predict PV power generation [34]. Time series have been decomposed into high- and low-frequency components; the DBN model is then used to forecast the high-frequency trends, and the forecasted trend components are summarised in the final results. GA is the algorithm most commonly used to solve nonlinear optimisation problems. The genetic algorithm (GA) is inspired by the theory of evolution and relies on the calculation of individual fitness functions. The GA involves the iterative selection of elite individuals, crossover operations, and mutations [35]. A support vector machine (SVM) has been used to predict short-term solar PV power, with the SVM parameters optimised using the Meta-SVM Optimizer [36].
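The GA loop described above (iterative selection of elite individuals, crossover, and mutation) can be sketched as follows; the fitness function and all parameter values are invented for illustration only.

```python
# Illustrative genetic algorithm loop: elite selection, crossover, mutation,
# applied to a small nonlinear optimisation problem.
import numpy as np

rng = np.random.default_rng(1)

def fitness(x):
    # Toy nonlinear objective to maximise (peak at x = [0.5, -0.3]).
    return -np.sum((x - np.array([0.5, -0.3])) ** 2, axis=-1)

pop_size, n_genes, n_elite, n_gen = 40, 2, 8, 50
pop = rng.uniform(-1.0, 1.0, size=(pop_size, n_genes))

for _ in range(n_gen):
    # Select the elite individuals with the highest fitness.
    elite = pop[np.argsort(fitness(pop))[-n_elite:]]
    # Crossover: children mix genes from two randomly chosen elite parents.
    parents_a = elite[rng.integers(n_elite, size=pop_size - n_elite)]
    parents_b = elite[rng.integers(n_elite, size=pop_size - n_elite)]
    mask = rng.random((pop_size - n_elite, n_genes)) < 0.5
    children = np.where(mask, parents_a, parents_b)
    # Mutation: small random perturbations keep the search exploratory.
    children += rng.normal(scale=0.05, size=children.shape)
    pop = np.vstack([elite, children])

best = pop[np.argmax(fitness(pop))]
print("best solution:", best, "fitness:", fitness(best))
```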

Since SVM uses quadratic programming, SVM training takes a long time when the number of items is large. Energy resource forecasting based on neural networks is very good at predicting solar power due to its strong task scheduling and outstanding mapping capability. A combined method for predicting PV power with ANN and analogue integration has been investigated [37]. ELM is developed and built on a feed-forward neural network (FNN); ELM can train without iteratively altering its weights and thresholds. It is characterised by rapid training speed, strong generalisation capability, and broad applicability [38]. The ELM model can effectively address complex nonlinear regression problems and has already been used to predict the irradiance and power output of PV systems. The specific objective of explaining PV power fluctuations using a graphical method based on the ELM model has been reported [39]. A similarity-based photovoltaic module power prediction model has been developed using the available historical data [40]. An ANN model is used to predict monthly global solar radiation for power prediction based on the geographical location [41]. The performance of the photovoltaic module varies with the geographical location. The PV system’s prediction is based on the machine learning algorithm developed, and the stability of the model is validated [42]. An algorithm has been developed to predict the load dispatch of the grid-connected photovoltaic system for the microgrid [43].

The PV output power is determined using various algorithms to assess prediction accuracy. Short-term, day-ahead power prediction is analysed, as is longer-term power prediction concerning climate conditions. A study on the building-integrated PV system is therefore required. In this study, the PV output is normalised based on experimental studies. A machine learning algorithm is used to predict the efficiency of the building integrated photovoltaic system for various orientations. The systems installed are the flat roof and the south-, east-, and west-oriented façades. Artificial neural network, decision tree, and quadratic support vector machine algorithms are used for short-term power prediction of the BIPV system’s performance.

2. Machine Learning Algorithms

State-of-the-art solar power technology will only be established if forecasters can predict how much solar power will be available at a specific location at a given time. The built model can be replicated since it includes only environmental data, without regard to geographical location. The machine learning models are developed with training, validation, and test sets, depending on the nature of the design. The workflow chart is shown in Figure 1.

2.1. Artificial Neural Network

Artificial neural networks (ANNs) can describe nonlinear, complex, and incremental behaviours through input-output training patterns. An ANN is characterised by its architecture (the connections between nodes), the method used to determine the weights, and the activation function.

Artificial neural networks’ ability to learn from large samples makes it possible to solve several major and complex problems [44]. The most common neural network structure is the feed-forward structure. A typical neural network is made up of different computational components called neurons. The weights and biases of the input, hidden, and output layers are optimised jointly until the output neuron values fall within an acceptable error tolerance. This approach has been successfully applied to regression problems [45]. This feed-forward network model is presented in Figures 2 and 3. The network has a hidden layer with many nodes, and the user-defined function types are shown in Table 1. ANN methods can handle nonlinear systems; still, problems of overfitting, local minima, random initial data, intensive training data requirements, and increased complexity due to multilayered architectures are the limitations [46, 47].
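For concreteness, a minimal feed-forward ANN regressor of the kind described above can be set up as follows with scikit-learn; the synthetic features and targets merely stand in for the BIPV dataset and are not the data used in this study.

```python
# Minimal feed-forward ANN regressor (one hidden layer) with a train/test split
# and RMSE evaluation, mirroring the workflow described in the text.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)

# Placeholder environmental inputs (e.g. irradiance, ambient temperature,
# humidity) and a normalised PV power output.
X = rng.uniform(size=(500, 3))
y = 0.7 * X[:, 0] + 0.2 * X[:, 0] * (1 - X[:, 1]) + 0.02 * rng.normal(size=500)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

model = MLPRegressor(hidden_layer_sizes=(20,), activation="tanh",
                     max_iter=2000, random_state=0)
model.fit(X_train, y_train)

rmse = np.sqrt(mean_squared_error(y_test, model.predict(X_test)))
print(f"test RMSE: {rmse:.4f}")
```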


Background

Muscle-computer interfaces (MCIs) have found use in a broad range of clinical and biotechnical domains [1]. Most salient within the category of clinical applications is perhaps the field of hand- and wrist prosthetics, where myoelectrically controlled prostheses have been part of clinical routine since the 1960s [2]. In this application, electromyography (EMG) signals are processed by an MCI and transformed into movement commands intended to modulate the behaviour of a powered actuator, i.e. a robotic replacement limb. The prototypical system [3] designed to this end utilizes a sparse setup of surface EMG (sEMG) electrodes which measure the activities of a single antagonistic muscle pair located superficially in the residual limb of the amputee. The difference in some measure of intensity (e.g. signal magnitude) between the sEMG signals from the pair can thereafter be mapped directly to the force driving a single motorized degree of freedom (DoF) which is typically instantiated as the grasp aperture of a hand-replacing gripper. Within this framework, the additional DoFs possessed by multifunctional prostheses (which have recently become more available to hand- and arm amputees [4]) must be controlled sequentially by use of auxiliary protocols, e.g. based on co-contraction [5] or non-EMG inputs [6], for DoF switching. The enduring preponderance of this direct control framework can be understood in light of the robustness brought about by the relative simplicity of the relevant hard- and software, as well as the ease with which the intensity of contraction of a single muscle group can be controlled volitionally. However, disadvantages such as limited dexterity, lack of intuitiveness, and an associated cognitive burden have been observed among users [7]; these are thought to be among the main reasons for the high abandonment rates by which devices controlled in this way are afflicted [8].

The divide that separates the direct control paradigm from advances seen in mechatronics has for a time spurred research into potential alternatives. A noteworthy candidate to this end is the use of myoelectric pattern recognition [9,10,11]—a class of methods which formulates the control problem as one of supervised machine learning. Within this framework, example segments $\boldsymbol{x}$ of a multichannel sEMG time series (typically acquired from the forearm), or more information-dense features [12] of the same, are, together with encodings of co-occurring movements $\boldsymbol{y}$, fed to a machine learning algorithm which generates a computable function $f_{\boldsymbol{\theta}}$. This learned function represents an approximate mapping between sEMG and movement and is typically derived by selecting the free parameters $\boldsymbol{\theta}$ such that $f_{\boldsymbol{\theta}}$ minimizes some loss metric $\sum_{t}\mathcal{L}(\boldsymbol{y}_{t},\hat{\boldsymbol{y}}_{t},\boldsymbol{\theta})$, where $\boldsymbol{x}_{t}$ and $\boldsymbol{y}_{t}$ are the sEMG segment and (a numeric encoding of) the concurrent movement, respectively, at time $t$, and $\hat{\boldsymbol{y}}_{t}=f_{\boldsymbol{\theta}}(\boldsymbol{x}_{t})$ is (a numeric encoding of) the inferred movement. Following such initial calibration, $f_{\boldsymbol{\theta}}$ can be used to process previously unseen segments by recognizing movement-specific sEMG patterns; an MCI based on pattern recognition can thus be understood as a form of gesture recognition system.
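A minimal sketch of this supervised formulation, with synthetic windowed features standing in for real sEMG and a linear discriminant classifier playing the role of $f_{\boldsymbol{\theta}}$, might look as follows; all names and data here are illustrative assumptions, not the setup of any cited study.

```python
# Windowed sEMG features x_t paired with movement labels y_t are used to fit a
# classifier f_theta, which then predicts movements for unseen windows.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

n_windows, n_channels = 600, 8
movements = rng.integers(0, 3, size=n_windows)           # y_t: 3 movement classes

# x_t: e.g. one feature per channel, shifted per class to mimic
# movement-specific sEMG patterns.
features = rng.normal(size=(n_windows, n_channels)) + movements[:, None]

X_train, X_test, y_train, y_test = train_test_split(
    features, movements, test_size=0.25, random_state=0)

clf = LinearDiscriminantAnalysis()                        # f_theta
clf.fit(X_train, y_train)                                 # calibration
print("held-out accuracy:", clf.score(X_test, y_test))    # recognition of unseen windows
```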

The contemporary engineering research literature shows no signs of scarcity when surveyed for approaches based on pattern recognition aiming to accommodate the mechanical sophistication of available robotic limbs. Algorithms from the broader machine learning discipline such as linear discriminant analysis [13], support vector machines [14], hidden Markov models [15], and decision trees/random forests [16] have, among several others, been applied for this purpose; such methods have at times reached impressive classification accuracies of more than 95% for movement class sets with cardinalities exceeding 10 [17]. As in most other technical pursuits in which statistical inference plays a part, Deep Learning [18] in the form of, for example, convolutional neural networks (e.g. [19,20,21,22,23]) and recurrent neural networks (e.g. [24, 25]), has recently found widespread use in myoelectric control research [26] and has frequently attained exceptional accuracy scores. Unlike their ‘classical’ machine learning counterparts, such methods avert the need for manual feature engineering via their ability to gainfully operate directly on raw sEMG, but are often hampered by a need for time-consuming hyperparameter tuning, large datasets, and/or computational resources infeasible for embedded systems [27].

Independent of the minutiae of any specific algorithm, the improvements over the industrial and clinical status quo made possible by pattern recognition are quite apparent. Importantly, use of pattern recognition is congruent with complete naturalness of control: The task of mapping a detected movement attempt to a movement command corresponding to the very same movement is trivial, thus enabling an intuitive form of steering. Similarly, multiarticulate control can be realized either implicitly, by detecting separate multiarticulate movements and/or grasps as individual classes, or explicitly, by detecting each DoF separately using multi-output versions of pattern recognition [22, 28,29,30]. In spite of such alluring promises, the fact remains that remarkably few implementations of pattern recognition have so far been deployed at scale in the daily life of amputees [31].

Conjecturally, one of the main obstacles separating myoelectric pattern recognition from widespread adoption within prosthetics relates to the phenomenon of drift in the data-generating distribution $P(\boldsymbol{x} \mid \boldsymbol{y})$ from which sEMG is sampled [32]. Stated succinctly, the statistical relationship connecting measured myoelectric activity $\boldsymbol{x}$ to movement $\boldsymbol{y}$ is not necessarily identical to the relationship which was valid at the time of calibration data acquisition, making the problem a specific instance of model overfitting. Variations in electrode positions, skin conductivity, limb placement, load, and fatigue are all examples of mechanisms which modulate the characteristics of the acquired sEMG [32], making the learned mapping $f_{\boldsymbol{\theta}}(\boldsymbol{x})=\hat{\boldsymbol{y}}$ obsolete and thus degrading MCI performance over time [33]. Drift of this kind has in the past been mitigated either by including calibration data from a varied set of recording circumstances (although this approach has limitations regarding scalability [10]) or by using adaptive control strategies [34]. As will be argued in this paper, a complementary strategy is to develop methods which yield more generalizable mappings from sEMG to movement via regularization.

In addition to problems of robustness and stability of the aforementioned kind, one drawback of straightforwardly applying pattern recognition relates to proportionality of control. To make effective use of a prosthesis it is practical, and perhaps even necessary, to be able to not only transmit what movement to perform, but also to transmit information of the desired force and velocity—a capability not granted by basic pattern recognition. A naïve solution is to reformulate the classification problem as one of direct regression (of kinetics and/or kinematics), as is certainly notionally consistent [35]. However, at some point this requires ground truth measurements of relevant regressands, which in principle are impossible to acquire from prosthesis users. One way to circumvent this anatomical limitation has been the use of mirrored training [36], where sEMG from the amputation stump, collected during mediolaterally mirrored movements, is used to infer the kinematics of the contralateral, intact limb. Regression has also been realized by using continuous visual movement instruction stimuli as regressand [21], which requires the subject to manually vary the intensity of muscle contraction during acquisition of calibration data. Regardless of method, proportional interfaces have been observed to lead to higher levels of user adaptation [37], potentially due to their greater resemblance to natural motor control.

An alternative way of extending myoelectric pattern recognition into the continuous domain, one that does not require continuous target measurements, is to leverage the fact that aggregated sEMG activity can be modulated volitionally, and thus to estimate movement class and intensity of contraction separately. This approach, which has been applied both in previous laboratory studies (e.g. [38,39,40]) and commercially [41], uses a classifier to determine what gesture is to be performed. Following classification, the detected gesture is performed with velocity directly proportional to either (I) the concurrently estimated force of contraction (with e.g. instantaneous sEMG magnitude as proxy), or (II) some monotonically increasing function thereof. Such functions can be tuned automatically and independently for each detectable movement, thereby accounting for systematic differences in intensity between movement classes [40]. Albeit uncomplicated and demonstrably effective, these strategies can be understood as problematic for a number of reasons. Firstly, there is no guarantee that the pattern associations learned during model calibration will generalize to all intensities of contraction [10], and thus some sEMG patterns might inadvertently be classified as patterns co-occurring with other movement classes. Such mistakes can plausibly lead to an MCI output perceived as erratic by the user. Secondly, proportionality mediated in this way is not simultaneous over all available DoFs, as only a single dimension of proportional information (i.e. the globally estimated intensity of myoelectric activity) is available.
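In code terms, the classify-then-scale strategy described above might be sketched as follows; the helper names, the mean-absolute-value intensity proxy, and the per-class gains are assumptions for illustration, not the implementation of any cited system.

```python
# Classify-then-scale sketch: a classifier picks the movement class for the
# current sEMG window, and the command velocity for that movement is set
# proportional to a simple global intensity proxy (mean absolute amplitude).
import numpy as np

def classify_window(window, clf, feature_fn):
    """Predict the movement class for one multichannel sEMG window."""
    return int(clf.predict(feature_fn(window)[None, :])[0])

def proportional_velocity(window, gain=1.0):
    """Map global sEMG intensity (mean absolute value) to a command velocity."""
    intensity = np.mean(np.abs(window))
    return gain * intensity   # option (I); option (II) would apply a monotone function

# Hypothetical usage with a previously calibrated classifier `clf` and
# per-movement gains `per_class_gain` (both placeholders):
#   movement = classify_window(window, clf, feature_fn=lambda w: np.mean(np.abs(w), axis=0))
#   velocity = proportional_velocity(window, gain=per_class_gain[movement])
```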

In addition to developments in pattern recognition, studies of methods which are not directly based on regression or classification have demonstrated the potential of several alternative paths towards natural, simultaneous, and proportional myoelectric control. Multisite intramuscular EMG (iEMG), which can measure motor unit action potentials directly [42], has been investigated as a mechanism for direct control, and has furthermore been shown to possess functional advantages when compared to proportional pattern recognition [43]. Weakly supervised autoencoding has shown promising results in unlabelled separation of underlying sEMG signal components which can be mapped to kinematics directly [44]. Nonnegative matrix factorization has been used [45] to extract multiple simultaneous DoFs separately from rectified and filtered sEMG while retaining their respective proportionalities. Techniques for deconvoluting high-density sEMG based on models informed by neuromuscular physiology have successfully been applied towards the same end [46]. Although at the cutting edge of electrophysiology, such approaches have, possibly due to advanced modes of signal acquisition, so far mostly been constrained to the laboratory environment.

In order to aid in the pursuit of practical MCIs and to alleviate the limitations of available methods, this paper introduces a new set of methods aimed at achieving intuitive, proportional, and simultaneous myoelectric control. Concretely, the framework is constituted by a computationally lightweight neural network topology with a compatible optimization procedure, all described in detail in the Methods section. In contrast to previous frameworks based on pattern recognition, the proposed combination of techniques operates to learn nonlinear mappings from forearm sEMG to continuous and multivariate encodings of hand- and wrist kinematics, despite only being calibrated with sEMG signals labelled with categorical movement instruction stimuli. This affords the framework the advantage of regression-based approaches (i.e. proportionality) while requiring neither kinematic ground truth data nor complicated recording protocols. Additionally, by incorporating a multi-task learning formulation of the kinematic inference problem, the framework implicitly allows for independent and simultaneous control of all considered DoFs. Due to its reliance on signal representations [47] arising from supervised learning with regularizing constraints, the novel framework is referred to as myoelectric representation learning (MRL). To demonstrate the viability of MRL and to quantify differences in performance compared to the current commercial standard for pattern recognition, this paper includes experiments in which test subjects were assessed for efficacy of control when using (I) MRL and (II) pattern recognition as represented by linear discriminant analysis (LDA) [40] to perform a virtual Fitts’s law [48] type test. Furthermore, to quantify temporal deterioration of myocontrol quality, the performances of both methods were reassessed after 7 days of intermission. Interestingly, distributed representations learned by the MRL model seem insensitive to small drifts in the data-generating distribution over time, leading to a stable interface across the two usage sessions.




Mechanical fault diagnosis using Convolutional Neural Networks and Extreme Learning Machine

A novel diagnosis model, integrating Convolutional Neural Networks and Extreme Learning Machine, is proposed.

Weight orthogonality constraint is employed in CNN to achieve divergent feature representations.

The Extreme Learning Machine is used to improve the classification performance.

The proposed method achieves higher classification accuracy and needs less computational time.


5. Conclusions

In this study, we evaluated the effects of using sub-image patches for synthetic CT image generation for head and neck cancer patients using two different state-of-the-art generative adversarial network models, namely, the pix2pix and CycleGAN models. For our independent test sets, the dosimetric accuracy of both pix2pix and CycleGAN had absolute percent dose differences of 2% or less. While indicative of sufficient accuracy on a small sample size, these methods, in general, need evaluation on a larger cohort. We also found that modeling aleatoric uncertainties by combining overlapping sub-patch HU estimations may potentially aid in providing estimates of reliability in sCT generation and help to identify regions with potentially problematic domain transformations.


Science

Vol 372, Issue 6539
16 April 2021


By Nicholas A. Steinmetz , Cagatay Aydin , Anna Lebedeva , Michael Okun , Marius Pachitariu , Marius Bauza , Maxime Beau , Jai Bhagat , Claudia Böhm , Martijn Broux , Susu Chen , Jennifer Colonell , Richard J. Gardner , Bill Karsh , Fabian Kloosterman , Dimitar Kostadinov , Carolina Mora-Lopez , John O’Callaghan , Junchol Park , Jan Putzeys , Britton Sauerbrei , Rik J. J. van Daal , Abraham Z. Vollan , Shiwei Wang , Marleen Welkenhuysen , Zhiwen Ye , Joshua T. Dudman , Barundeb Dutta , Adam W. Hantman , Kenneth D. Harris , Albert K. Lee , Edvard I. Moser , John O’Keefe , Alfonso Renart , Karel Svoboda , Michael Häusser , Sebastian Haesler , Matteo Carandini , Timothy D. Harris

An approach has been developed that allows recording from the same neurons in a freely behaving animal for weeks and months.


2 Literature review

Numerous studies regarding online learning across higher education have been conducted that have enhanced both the understanding and practical implications of adopting different modes of online learning, such as blended, asynchronous, and synchronous learning [15]. In determining the success of e-learning in higher education, student satisfaction is an important indicator of performance [2, 5, 8–16]. Duque (2013) proposed a framework for evaluating higher education performance with students’ satisfaction, perceived learning outcomes, and dropout intentions, and found that dropout intentions were strongly and negatively associated with student satisfaction [32]. Meanwhile, Kuo et al. (2014) highlighted the close relationship between student satisfaction and motivation, dropout rates, success, and learning commitment [33]. Furthermore, Pham et al. (2019) have shown a positive relationship between student satisfaction and loyalty in Vietnamese adult and higher education [13]. According to the E-learning systems success (EESS) model, student satisfaction is a key component in determining E-learning success [8]. Therefore, comprehensively understanding the underlying factors influencing student satisfaction will enable the improvement of online teaching and learning design and execution [16].

2.1 Current theories for satisfaction in E-learning

Multiple factors have been proposed that identify and influence students’ satisfaction regarding E-learning [8]. An early E-learning research model developed by DeLone and McLean (2003) was primarily based on the quality of information, systems, and services that determined user satisfaction [34]. This model has been used to compare E-learning success between male and female students in Malaysian universities during the COVID-19 pandemic [14]. Another significant approach for developing a theoretical framework in E-learning research is the user satisfaction approach [8]. A recent study conducted by Yawson and Yamoah (2020) adopted this approach using a 7-point Likert scale to measure satisfaction with E-learning in higher education in developing countries (i.e., Ghana) [16]. Question items in their study included domains of course design, delivery, interaction, and delivery environment. However, this study did not focus on ERL although the study period overlapped with the pandemic. Apart from the aforementioned models, other technology acceptance and E-learning quality models have been developed with an emphasis on usefulness and ease of use [8, 35]. Due to the unique characteristics, strengths, and limitations of each research model, Al-Fraihat et al. (2020) have further formulated a multidimensional conceptual model for evaluating E-learning success more holistically [8].

Interestingly, a recent study by Shim and Lee (2020) developed a semi-structured questionnaire, without adopting the aforementioned models, to conduct a thematic analysis investigating colleges’ experience of ERL during the COVID-19 pandemic in South Korea [5]. Similarly, Alqurshi (2020) used a tailor-made questionnaire to measure students’ satisfaction using 5-point Likert-scale questions focusing on virtual classrooms, completion of course learning outcomes, and alternative assessments in different institutions in Saudi Arabia [10]. The previously mentioned theoretical models were built to evaluate pre-planned E-learning, whereas the deployment of ERL during the COVID-19 pandemic was abrupt; direct use of E-learning research models may therefore not suitably reflect the underlying factors affecting the success and satisfaction of ERL. For this reason, a tailor-made survey kit was recently developed by EDUCAUSE that institutions can rapidly adopt to gather feedback from higher education stakeholders [36]. Therefore, the subsequent literature review has been primarily based on the items and constructs proposed in the EDUCAUSE survey kit, while taking reference from the components of the multidimensional EESS model.

2.2 Readiness and accessibility

The first part of the EDUCAUSE survey kit (2020) focuses on technological issues and challenges during the transition to remote learning [36]. Questions included the level of discomfort and familiarity of instructors and students while using technological applications, the adequacy of digital replacements for face-to-face collaboration tools (e.g., whiteboards), and accessibility to a reliable internet connection, communication software, and specialized software and tools. According to Al-Fraihat et al. (2020), the direct association between system quality and student satisfaction was assumed in the original model of Delone and Mclean (2003) [8, 34]. Similarly, other literature also suggests that improved system quality positively influences student satisfaction in E-learning [8, 37]. In the EESS model, technical system quality has several subset items, including ease of use and learning, user requirements, and the system’s features, availability, reliability, fulfillment, security, and personalization. Meanwhile, Al-Fraihat et al. (2020) highlighted different obstacles when adopting E-learning in developing and developed countries [8]. For example, resources, accessibility, and infrastructure are more important for developing countries, while information quality and usefulness of the system are more important in developed regions. However, low-income families also exist in developed countries, and students from relatively poor living environments may face similar problems to those living in developing countries, although the technological infrastructure of higher education institutes is better developed.

Also, self-efficacy, defined as an individual’s belief in their own ability to perform a certain task or challenge or to successfully engage with educational technology [38, 39], was shown to be interconnected with student satisfaction levels [40]. Recently, Prifti (2020) identified that learning management system self-efficacy positively influenced student satisfaction in blended learning in Tirana, Albania, while both platform content and accessibility were important constructs affecting the self-efficacy level [41]. Similarly, Geng et al. (2019) found technology readiness positively influenced learning motivation during blended learning in higher education [42]. Interestingly, Alqurashi (2018) reported conflicting findings regarding the impact of students’ self-efficacy for using technology on student satisfaction, as more recent studies suggest university students have become more competent and confident in using technology when conducting online learning [43]. However, Rizun et al. (2020) recently confirmed that self-efficacy levels did affect students’ acceptance in terms of perceived ease of use and usefulness when conducting ERL in Poland during the COVID-19 pandemic. Since the circumstances of well-planned and designed E-learning differ from those of ERL, it is important to assess important constructs such as accessibility and students’ readiness, including their self-efficacy, to determine ERL success [44].

2.3 Instructor, assessment, and learning

Another focus in the EDUCAUSE survey kit is learning and education-related issues. Focused questions include the personal preference for face-to-face learning, assessment requirements, students’ attention to remote classes and activities, the availability and responsiveness of instructors, and whether the original lessons were well translated to a remote format. Alqurashi (2018) showed the importance of quality learner-instructor interaction as two-way communication between the instructor and students [43]. Besides, his study used a multiple regression, which showed that learner-content interaction was the most important predictor of student satisfaction, further supporting the findings from the Kuo et al. (2014) study [33]. Providing user-friendly and accessible course materials assists in motivating students’ learning and understanding, in turn leading to increased student satisfaction. Meanwhile, the authors recommended that students pay more attention to the feedback and responses from the course instructors, such as asking and answering questions, receiving feedback, and taking part in online discussions. Recently, Muzammil et al. (2020) demonstrated similar findings in Indonesian higher education using a structural equation model [12]. They showed that student-tutor interaction significantly contributed to the level of student engagement, whereas student satisfaction levels were greatly dictated by their engagement level. This was further demonstrated by Pham et al. (2019), who showed that the instructor’s ability to deliver quality E-learning provision affected Vietnamese college students’ satisfaction and loyalty [13]. In their study, data regarding perceived E-learning instructor quality from a student’s perspective were gathered via several questions focusing on instructors’ knowledge, responsiveness, consistency in delivering good lectures, organization, class preparation, encouragement of interactive participation, and whether the instructors have the students’ best long-term interests in mind. However, a recent review by Carpenter et al. (2020) raised the issue of students’ “illusional learning”, where well-polished lectures delivered by enthusiastic and engaging instructors can inflate students’ subjective impressions and judgments of learning [45]. Since the evaluation of teaching effectiveness and quality of teachers from the students’ point of view may carry a strong bias, when designing a questionnaire concerning the instructor and E-learning for students, the focus should be placed on familiarity with E-learning technology, responsiveness, and availability rather than teaching quality, performance, and usefulness.

Based on the EESS model, the diversity in assessment materials significantly determines the educational system quality which contributes to the prediction of perceived satisfaction [8, 37]. Placing importance on assessments for predicting student satisfaction during E-learning was further supported by Hew et al. (2019) [26]. For example, using machine learning and hierarchical linear models, assessments were confirmed as a significant and important sentiment for predicting student satisfaction for MOOC (Massive Open Online Courses). Recently, Rodriguez et al. (2019) used multiple linear regression to determine assessment procedures and appropriate level of assessment demand as important predictors for student satisfaction levels in multiple universities from Andalusia, Spain [30]. When assessment-related aspects are considered while conducting ERL during a crisis, Shim and Lee (2020) identified comments regarding dissatisfaction with assessments, such as increasing the burden of final exams after the deletion of mid-term assessments, the vagueness of test evaluations, and increased quantities of assignments during COVID-19 [5]. As certain practical or tutorial classes might be moved to remote learning format or substituted by other learning activities, the change in assessment methods to accommodate such temporary shifts to ERL was necessary to match the actual learning quantity and quality of students. Therefore, the evaluation of the clarity and appropriateness of accommodated assessments seems to be essential in the prediction of students’ satisfaction for ERL during a pandemic.

2.4 Self-concerned

Referring to the EESS model, learners’ anxiety, as part of learner quality, somewhat contributes to perceived satisfaction [8]. Bolliger and Halupa (2012) define anxiety as ‘the conscious fearful emotional state’ and further proposed close relations between computer, internet, and online course anxiety [46]. In their study, a significant but negative association between student anxiety and satisfaction was detected, and several anxiety-related aspects such as performance insecurity, hesitation, and nervousness were proposed to be closely linked with student satisfaction. However, as Alqurashi (2018) emphasized, the high computer and online learning competency of students nowadays may limit the relevance of findings from prior studies addressing the effects of computer- and internet-related anxiety on students’ perceived satisfaction [43]. Therefore, suggested questions in the self-concern section of the EDUCAUSE survey kit include other aspects potentially related to ERL, such as worry about course performance or grade, concern about reduced interaction with classmates and instructors, a potential delay of graduation or completion of the program, privacy, and food or housing security [36].

2.5 The application of multiple regression in predicting student satisfaction

Numerous researchers have used statistical methods to analyze satisfaction scores, perceived learning, interaction, self-efficacy, and other factors related to online learning [33, 43, 47–51]. For example, multiple linear regressions have been used to produce several predictive models for examining and comparing the interaction and amount of variance explained for different predictors of student satisfaction [43]. Multiple linear regression contains more than one independent variable (X1, …, Xp). It can be regarded as an extension of simple linear regression, which fits the straight line Y = β0 + β1X, where β0 is the intercept and β1 is the slope. This statistical method has been widely used because of its simple algorithm and mathematical calculation [43, 52, 53]. Previous studies have shown its strong predictive power in applications, but the estimated regression coefficients can be greatly affected if high correlations between predictors exist (the multicollinearity issue) [54]. Apart from simple linear regression, a hierarchical linear model is commonly used to deal with more complicated data of a nested nature [26]. Meanwhile, stepwise multiple regression, combining forward and backward selection techniques, has been widely adopted for its efficiency in using the minimum number of important predictors to build a successful prediction model. However, numerous studies have pointed out potential flaws of stepwise regression such as multicollinearity, overfitting, and the selection of nuisance variables rather than useful variables [55, 56]. Since only numerical variables are allowed when building predictive models with multiple linear regression, categorical predictors, including nominal and ordinal variables, must be converted to binary code using dummy variables before modeling.
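As a brief illustration of the dummy-variable point above, a multiple linear regression with one categorical predictor could be fitted as follows; the column names, data, and coefficients are invented for illustration and do not come from the studies cited.

```python
# Multiple linear regression with mixed predictors: the categorical column is
# converted to dummy (binary) variables before fitting.
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 300

df = pd.DataFrame({
    "self_efficacy": rng.uniform(1, 5, n),                  # numeric predictor
    "instructor_response": rng.uniform(1, 5, n),            # numeric predictor
    "device": rng.choice(["laptop", "tablet", "phone"], n)  # categorical predictor
})
satisfaction = (0.6 * df["self_efficacy"] + 0.3 * df["instructor_response"]
                + (df["device"] == "laptop") * 0.4 + rng.normal(0, 0.3, n))

# Convert the categorical predictor to dummy variables before modelling.
X = pd.get_dummies(df, columns=["device"], drop_first=True)

model = LinearRegression().fit(X, satisfaction)
print("intercept (beta_0):", model.intercept_)
print(dict(zip(X.columns, model.coef_)))
```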

2.6 The use of machine learning

As opposed to multiple linear regressions, other machine learning methods under the umbrella of artificial intelligence are increasingly used for predictive purposes [52, 57–63]. The advantage of machine learning is the ability to use both categorical and numerical predictors to generate models by assessing linear and non-linear relationships between variables, as well as the importance of each predictor. Common machine learning algorithms for predicting numerical outcomes using regressors have been widely studied and adopted in different contexts, such as K-nearest neighbor (KNN) [57], support vector regression (SVR) [58, 61], an ensemble of decision trees with random forest (RF) [60], the gradient boosting method (GBM) [62, 63], multilayer perceptron regression (MLPR) simulating the structure and operation of human neural network architecture [52], and elastic net (ENet) [64].

The KNN is a nonparametric method that uses a query point for making predictions. By computing the Euclidean distance between that point and all points in the training data set, the closest K training data points are picked, and the prediction is achieved by averaging the target output values of those K points [57]. It is a simple machine learning method and easy to tune for optimization.

The support vector machine was developed as a supervised machine learning technique for handling classification problems; SVR was later extended from the original support vector machine algorithm for solving multivariate regression problems [57, 61]. By constructing a set of hyperplanes in high-dimensional space, SVR renders a non-linearly separable problem linearly separable [57, 61]. Therefore, SVR is a good option for problems with high-dimensional data with a lesser risk of overfitting, though it is sensitive to outliers and very time-consuming to train with large datasets.

The RF is a non-parametric method using an ensemble of decision trees, in which trees vote for the most popular class (or their outputs are averaged for regression) and the results from the trees are aggregated as the final output. In training the RF model, a multitude of decision trees is constructed using a collection of random variables [59, 60]. Random forest is broadly applicable to different populations because it is fast and efficient in generating predictions, with only a few parameters required to tune for model optimization. Moreover, it can be used for high-dimensional problems and provides feature importance for further analysis.

Unlike other tree-based machine learning techniques that use level-wise learning to grow the tree vertically, LightGBM is an improved form of the gradient boosting algorithm. It uses a leaf-wise, tree-based approach that enhances scalability and efficiency, requiring less computational time without sacrificing model accuracy. Recent studies have shown excellent predictive performance on different data [62, 63].

The MLPR is a form of feedforward artificial neural network, simulating the structure and asynchronous activity of the human nervous system. With the input, hidden and output layers of nodes, neurons can perform nonlinear activation functions and distinguish non-linear data for supervised machine learning models [52, 57].

The ENet method was initially developed to simulate an elastic fishing net to retain “all the big fish”, through automatic selection of predictors and continuous shrinkage. While the selection of a group of correlated variables is allowed, it provides both features and benefits of “ridge” and “least absolute shrinkage and selection operator (LASSO) regression”. This was regarded as an improved form of multiple linear regression using ordinary least squares [65]. Recent studies demonstrated superior performance in using ENet over other regression methods in handling multicollinearity of predictors for numerical predictions [66, 67].
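To make the comparison concrete, the regressors surveyed above can be benchmarked on a single synthetic dataset with scikit-learn as sketched below; GradientBoostingRegressor is used here as a stand-in for LightGBM, and all data and hyperparameters are placeholders rather than the settings used in the cited studies.

```python
# Compact cross-validated comparison of the regression methods described above
# on one synthetic dataset.
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 10))
y = X[:, 0] - 0.5 * X[:, 1] ** 2 + 0.3 * X[:, 2] * X[:, 3] + rng.normal(0, 0.2, 400)

models = {
    "KNN": KNeighborsRegressor(n_neighbors=5),
    "SVR": SVR(kernel="rbf", C=1.0),
    "RF": RandomForestRegressor(n_estimators=200, random_state=0),
    "GBM": GradientBoostingRegressor(random_state=0),
    "MLPR": MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0),
    "ENet": ElasticNet(alpha=0.1, l1_ratio=0.5),
}

for name, model in models.items():
    pipe = make_pipeline(StandardScaler(), model)   # scaling matters for KNN/SVR/MLPR/ENet
    score = cross_val_score(pipe, X, y, cv=5, scoring="neg_root_mean_squared_error")
    print(f"{name:5s} CV RMSE: {-score.mean():.3f}")
```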

There is no best machine learning or statistical method for prediction accuracy, given the different structure and nature of datasets, including the number of variables, dimensionality, and cardinality of predictors, that can substantially influence the accuracy of each algorithm [52, 57, 60–63, 67]. Although previous studies have shown machine learning algorithms to outperform multiple linear regressions, especially in handling complicated models or datasets with high complexity, most machine learning methods are black box in nature and uninterpretable [68–70]. Consequently, the trade-off between prediction accuracy and capability in model explanation has become controversial for making decisions in using simple and transparent models like multiple linear regression or potentially more accurate but complicated black-box machine learning models. Recently Abu Saa et al. (2019) have highlighted the frequent use of machine learning techniques for educational data mining including Decision Trees, Naïve Bayes, artificial neural networks, support vector machine, and logistic regression [31]. Therefore, the use of machine learning algorithms in solving educational research problems such as student satisfaction can be a future exploratory direction.

2.7 Feature selection before building predictive models

To successfully build predictive models, feature selection is a critical and frequently used technique in both statistics and machine learning for choosing a subset of attributes from the original features. This process attempts to reduce the high-dimensional feature space by removing redundant and irrelevant predictors and retaining only highly relevant features, thereby enhancing model performance [56, 71]. In addition to the automated selection process in stepwise regression, recursive feature elimination (RFE) is another commonly used feature selection method. By repeatedly eliminating the lowest-ranked (least relevant) features and comparing the corresponding model accuracy after each RFE iteration, the subset of features/predictors is finalized for formulating the optimal model. In this regard, previous studies have shown the benefit of RFE approaches in enhancing prediction accuracy when building classification or regression models after noise variables are removed [71–74].
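As a concrete illustration of this procedure, the following minimal sketch uses scikit-learn's cross-validated RFE; the placeholder predictor matrix, outcome, and random forest base estimator are assumptions for illustration rather than the configuration used in any cited study.

```python
# Minimal sketch: recursive feature elimination (RFE) with cross-validation.
# X (n_samples x n_features) and y are placeholders standing in for survey
# predictors and a satisfaction score; the base estimator is illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import RFECV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))                                 # placeholder predictors
y = 2 * X[:, 0] + X[:, 3] + rng.normal(scale=0.5, size=200)    # placeholder outcome

selector = RFECV(
    estimator=RandomForestRegressor(n_estimators=100, random_state=0),
    step=1,                          # drop the lowest-ranked feature each iteration
    cv=5,                            # compare model accuracy after each round
    scoring="neg_mean_squared_error",
)
selector.fit(X, y)
print("Selected features:", np.flatnonzero(selector.support_))
```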


Synergy Across AI and Other Fields

The promise of AI lies in applications that combine aspects of each of the above subfields with other elements of computing such as database management and signal processing. 7 The increasing potential of AI in surgery is analogous to other recent technological developments (e.g. mobile phones, cloud computing) that have arisen from the intersection of hyper-cycle advances in both hardware and software (i.e. as hardware advances, so too does software and vice versa).

Synergy between fields is also important in expanding the applications of AI. Combining NLP and computer vision, Google (Mountain View, CA, USA) Image Search is able to display relevant pictures in response to a textual query such as a word or phrase. Furthermore, neural networks, specifically deep learning, now form a significant part of the architecture underlying various AI systems. For example, deep learning in NLP has allowed for significant improvements in the accuracy of translation (60% more accurate translation by Google Translate 31 ) while its use in computer vision has resulted in greater accuracy of classification of images (42% more accurate image classification by AlexNet 32 ).

Clinical applications of such work include the successful utilization of deep learning to create a computer vision algorithm for the classification of smartphone images of benign and malignant skin lesions at an accuracy level equivalent to dermatologists. 33 NLP and ML analyses of postoperative colorectal patients demonstrated that prediction of anastomotic leaks improved to 92% accuracy when different data types were analyzed in concert instead of individually (accuracy of vital signs – 65%, lab values – 74%, text data – 83%). 34

Early attempts at using AI for technical skills augmentation focused on small feats such as task deconstruction and autonomous performance of simple tasks (e.g. suturing, knot-tying). 35, 36 Such efforts have been critical to establishing a foundation of knowledge for more complex AI tasks. 37 For example, the Smart Tissue Autonomous Robot (STAR) developed by Johns Hopkins University was equipped with algorithms that allowed it to match or outperform human surgeons in autonomous ex-vivo and in-vivo bowel anastomosis in animal models. 38

While truly autonomous robotic surgery will remain out of reach for some time, synergy across fields will likely accelerate the capabilities of AI in augmenting surgical care. For AI, much of its clinical potential is in its ability to analyze combinations of structured and unstructured data (e.g. EMR notes, vitals, laboratory values, video, and other aspects of “big data”) to generate clinical decision support. Each type of data could be analyzed independently or in concert with different types of algorithms to yield innovations.

The true potential of AI remains to be seen and could be difficult to predict at this time. Synergistic reactions between different technologies can lead to unanticipated revolutionary technology; for example, recent synergistic combinations of advanced robotics, computer vision, and neural networks led to the advent of autonomous cars. Similarly, independent components within AI and other fields could combine to create a force multiplier effect with unanticipated changes to healthcare delivery. Therefore, surgeons should be engaged in assessing the quality and applicability of AI advances to ensure appropriate translation to the clinical sector.


Background

Muscle-computer interfaces (MCIs) have found use in a broad range of clinical and biotechnical domains [1]. Most salient within the category of clinical applications is perhaps the field of hand- and wrist prosthetics, where myoelectrically controlled prostheses have been part of clinical routine since the 1960s [2]. In this application, electromyography (EMG) signals are processed by an MCI and transformed into movement commands intended to modulate the behaviour of a powered actuator, i.e. a robotic replacement limb. The prototypical system [3] designed to this end utilizes a sparse setup of surface EMG (sEMG) electrodes which measure the activities of a single antagonistic muscle pair located superficially in the residual limb of the amputee. The difference in some measure of intensity (e.g. signal magnitude) between the sEMG signals from the pair can thereafter be mapped directly to the force driving a single motorized degree of freedom (DoF), which is typically instantiated as the grasp aperture of a hand-replacing gripper. Within this framework, the additional DoFs possessed by multifunctional prostheses (which have recently become more available to hand- and arm amputees [4]) must be controlled sequentially by use of auxiliary protocols, e.g. based on co-contraction [5] or non-EMG inputs [6], for DoF switching. The enduring preponderance of this direct control framework can be understood in light of the robustness brought about by the relative simplicity of the relevant hard- and software, as well as the ease with which the intensity of contraction of a single muscle group can be controlled volitionally. However, disadvantages such as limited dexterity, lack of intuitiveness, and an associated cognitive burden have been observed among users [7]; these are thought to be among the main reasons for the high abandonment rates by which devices controlled in this way are afflicted [8].
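As a rough sketch of this two-site direct control scheme (the RMS intensity measure, gain, and deadband are illustrative assumptions, not the specification of any particular commercial controller):

```python
# Minimal sketch of two-site direct myoelectric control (illustrative only).
# flexor and extensor are 1-D arrays holding one window of sEMG samples
# from an antagonistic muscle pair.
import numpy as np

def rms(window: np.ndarray) -> float:
    """Root-mean-square amplitude: a simple proxy for contraction intensity."""
    return float(np.sqrt(np.mean(window ** 2)))

def direct_control_velocity(flexor: np.ndarray, extensor: np.ndarray,
                            gain: float = 1.0, deadband: float = 0.05) -> float:
    """Map the intensity difference of the pair to the velocity of one DoF."""
    diff = rms(flexor) - rms(extensor)
    if abs(diff) < deadband:        # ignore small co-activations and noise
        return 0.0
    return gain * diff              # sign selects open vs. close, magnitude sets speed

# Example: a stronger flexor signal drives the single DoF in one direction.
rng = np.random.default_rng(1)
flexor_win = 0.4 * rng.standard_normal(200)
extensor_win = 0.1 * rng.standard_normal(200)
print(direct_control_velocity(flexor_win, extensor_win))
```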

The divide that separates the direct control paradigm from advances seen in mechatronics has for a time spurred research into potential alternatives. A noteworthy candidate to this end is the use of myoelectric pattern recognition [9,10,11]—a class of methods which formulates the control problem as one of supervised machine learning. Within this framework, example segments x of a multichannel sEMG time series (typically acquired from the forearm), or more information-dense features [12] of the same, are, together with encodings of co-occurring movements y, fed to a machine learning algorithm which generates a computable function f_θ. This learned function represents an approximate mapping between sEMG and movement and is typically derived by selecting the free parameters θ such that f_θ minimizes some loss metric Σ_t L(y_t, ŷ_t, θ), where x_t and y_t are the sEMG segment and (a numeric encoding of) the concurrent movement, respectively, at time t, and ŷ_t = f_θ(x_t) is (a numeric encoding of) the inferred movement. Following such initial calibration, f_θ can be used to process previously unseen segments by recognizing movement-specific sEMG patterns; an MCI based on pattern recognition can thus be understood as a form of gesture recognition system.
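A minimal sketch of such a calibration step is given below; the window length, the mean-absolute-value feature, and the choice of a linear discriminant classifier are illustrative assumptions, and any supervised learner could take the place of f_θ.

```python
# Minimal sketch of myoelectric pattern recognition calibration (illustrative).
# emg: (n_samples, n_channels) multichannel sEMG; labels: movement class per sample.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def windowed_features(emg: np.ndarray, labels: np.ndarray, win: int = 200):
    """Cut the recording into segments x_t, compute a simple feature
    (mean absolute value per channel), and take the majority label as y_t."""
    feats, targets = [], []
    for start in range(0, len(emg) - win + 1, win):
        seg = emg[start:start + win]
        feats.append(np.mean(np.abs(seg), axis=0))
        targets.append(np.bincount(labels[start:start + win]).argmax())
    return np.array(feats), np.array(targets)

# Placeholder calibration data: 3 channels, 2 movement classes.
rng = np.random.default_rng(0)
emg = rng.standard_normal((4000, 3))
labels = np.repeat([0, 1], 2000)
emg[labels == 1] *= 2.0             # class 1 produces higher-amplitude activity

X, y = windowed_features(emg, labels)
f_theta = LinearDiscriminantAnalysis().fit(X, y)   # the learned mapping f_theta
print(f_theta.predict(X[:5]))                      # inferred movements y_hat
```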

The contemporary engineering research literature shows no signs of scarcity when surveyed for approaches based on pattern recognition aiming to accommodate the mechanical sophistication of available robotic limbs. Algorithms from the broader machine learning discipline such as linear discriminant analysis [13], support vector machines [14], hidden Markov models [15], and decision trees/random forests [16] have, among several others, been applied for this purpose; such methods have at times reached impressive classification accuracies of more than 95% for movement class sets with cardinalities exceeding 10 [17]. As in most other technical pursuits in which statistical inference plays a part, Deep Learning [18] in the form of, for example, convolutional neural networks (e.g. [19,20,21,22,23]) and recurrent neural networks (e.g. [24, 25]), has recently found widespread use in myoelectric control research [26] and has frequently attained exceptional accuracy scores. Unlike their ‘classical’ machine learning counterparts, such methods avert the need for manual feature engineering via their ability to gainfully operate directly on raw sEMG, but are often hampered by a need for time-consuming hyperparameter tuning, large datasets, and/or computational resources infeasible for embedded systems [27].

Independent of the minutiae of any specific algorithm, the improvements over the industrial and clinical status quo made possible by pattern recognition are quite apparent. Importantly, use of pattern recognition is congruent with complete naturalness of control: The task of mapping a detected movement attempt to a movement command corresponding to the very same movement is trivial, thus enabling an intuitive form of steering. Similarly, multiarticulate control can be realized either implicitly, by detecting separate multiarticulate movements and/or grasps as individual classes, or explicitly, by detecting each DoF separately using multi-output versions of pattern recognition [22, 28,29,30]. In spite of such alluring promises, the fact remains that remarkably few implementations of pattern recognition have so far been deployed at scale in the daily life of amputees [31].

Conjecturally, one of the main obstacles separating myoelectric pattern recognition from widespread adoption within prosthetics relates to the phenomenon of drift in the data-generating distribution P(x|y) from which sEMG is sampled [32]. Stated succinctly, the statistical relationship connecting measured myoelectric activity x to movement y is not necessarily identical to the relationship which was valid at the time of calibration data acquisition, making the problem a specific instance of model overfitting. Variations in electrode positions, skin conductivity, limb placement and load, and fatigue are all examples of mechanisms which modulate the characteristics of the acquired sEMG [32], making the learned mapping f_θ(x) = ŷ obsolete and thus degrading MCI performance over time [33]. Drift of this kind has in the past been mitigated either by including calibration data from a varied set of recording circumstances (although this approach has limitations regarding scalability [10]) or by using adaptive control strategies [34]. As will be argued in this paper, a complementary strategy is to develop methods which yield more generalizable mappings from sEMG to movement via regularization.

In addition to problems of robustness and stability of the aforementioned kind, one drawback of straightforwardly applying pattern recognition relates to proportionality of control. To make effective use of a prosthesis it is practical, and perhaps even necessary, to be able to transmit not only what movement to perform, but also information about the desired force and velocity—a capability not granted by basic pattern recognition. A naïve solution is to reformulate the classification problem as one of direct regression (of kinetics and/or kinematics), which is certainly notionally consistent [35]. However, at some point this requires ground truth measurements of the relevant regressands, which in principle are impossible to acquire from prosthesis users. One way to circumvent this anatomical limitation has been the use of mirrored training [36], where sEMG from the amputation stump, collected during mediolaterally mirrored movements, is used to infer the kinematics of the contralateral, intact limb. Regression has also been realized by using continuous visual movement instruction stimuli as regressand [21], which requires the subject to manually vary the intensity of muscle contraction during acquisition of calibration data. Regardless of method, proportional interfaces have been observed to lead to higher levels of user adaptation [37], potentially due to their greater resemblance to natural motor control.

An alternative way of extending myoelectric pattern recognition into the continuous domain, one that does not require continuous target measurements, is to leverage the fact that aggregated sEMG activity can be modulated volitionally, and thus estimate movement class and intensity of contraction separately. This approach, which has been applied both in previous laboratory studies (e.g. [38,39,40]) and commercially [41], uses a classifier to determine what gesture is to be performed. Following classification, the detected gesture is performed with velocity directly proportional to either (I) the concurrently estimated force of contraction (with e.g. instantaneous sEMG magnitude as proxy), or (II) some monotonically increasing function thereof. Such functions can be tuned automatically and independently for each detectable movement, thereby accounting for systematic differences in intensity between movement classes [40]. Albeit uncomplicated and demonstrably effective, these strategies can be understood as problematic for a number of reasons. Firstly, there is no guarantee that the pattern associations learned during model calibration will be generalizable to all intensities of contraction [10], and thus some sEMG patterns might inadvertently be classified as patterns co-occurring with other movement classes. Such mistakes can plausibly lead to an MCI output perceived as erratic by the user. Secondly, proportionality mediated in this way is not simultaneous over all available DoFs, as only a single dimension of proportional information (i.e. the globally estimated intensity of myoelectric activity) is available.
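The sketch below illustrates strategy (I), in which the classified gesture is driven with a velocity proportional to a global intensity estimate; the MAV proxy, gain, and placeholder calibration data are assumptions for illustration.

```python
# Minimal sketch: classification plus global proportional intensity (illustrative).
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
# Placeholder calibration features (MAV per channel) for two movement classes.
X_cal = np.vstack([rng.normal(0.2, 0.05, (50, 3)), rng.normal(0.8, 0.05, (50, 3))])
y_cal = np.repeat([0, 1], 50)
clf = LinearDiscriminantAnalysis().fit(X_cal, y_cal)

def proportional_command(window: np.ndarray, gain: float = 1.0):
    """Return (movement_class, velocity): the detected gesture is driven with a
    velocity proportional to the globally estimated contraction intensity."""
    features = np.mean(np.abs(window), axis=0).reshape(1, -1)   # MAV per channel
    movement = int(clf.predict(features)[0])
    velocity = gain * float(np.mean(np.abs(window)))            # strategy (I)
    return movement, velocity

window = rng.normal(0.0, 0.8, (200, 3))      # one incoming sEMG window
print(proportional_command(window))
```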

In addition to developments in pattern recognition, studies of methods which are not directly based on regression or classification have demonstrated the potential of several alternative paths towards natural, simultaneous, and proportional myoelectric control. Multisite intramuscular EMG (iEMG), which can measure motor unit action potentials directly [42], has been investigated as a mechanism for direct control, and has furthermore been shown to possess functional advantages when compared to proportional pattern recognition [43]. Weakly supervised autoencoding has shown promising results in unlabelled separation of underlying sEMG signal components which can be mapped to kinematics directly [44]. Nonnegative matrix factorization has been used [45] to extract multiple simultaneous DoFs separately from rectified and filtered sEMG while retaining their respective proportionalities. Techniques for deconvoluting high-density sEMG based on models informed by neuromuscular physiology have successfully been applied towards the same end [46]. Although at the cutting edge of electrophysiology, such approaches have, possibly due to advanced modes of signal acquisition, so far mostly been constrained to the laboratory environment.

In order to aid in the pursuit of practical MCIs and to alleviate the limitations of available methods, this paper introduces a new set of methods aimed at achieving intuitive, proportional, and simultaneous myoelectric control. Concretely, the framework is constituted by a computationally lightweight neural network topology with a compatible optimization procedure, all described in detail in the Methods section. In contrast to previous frameworks based on pattern recognition, the proposed combination of techniques operates to learn nonlinear mappings from forearm sEMG to continuous and multivariate encodings of hand- and wrist kinematics, despite only being calibrated with sEMG signals labelled with categorical movement instruction stimuli. This affords the framework the advantage of regression-based approaches (i.e. proportionality) while requiring neither kinematic ground truth data nor complicated recording protocols. Additionally, by incorporating a multi-task learning formulation of the kinematic inference problem, the framework implicitly allows for independent and simultaneous control of all considered DoFs. Due to its reliance on signal representations [47] arising from supervised learning with regularizing constraints, the novel framework is referred to as myoelectric representation learning (MRL). To demonstrate the viability of MRL and to quantify differences in performance compared to the current commercial standard for pattern recognition, this paper includes experiments in which subjects were assessed for efficacy of control when using (I) MRL, and (II) pattern recognition as represented by linear discriminant analysis (LDA) [40], to perform a virtual Fitts’s law [48] type test. Furthermore, to quantify temporal deterioration of myocontrol quality, the performances of both methods were reassessed after 7 days of intermission. Interestingly, distributed representations learned by the MRL model seem insensitive to small drifts in the data-generating distribution over time, leading to a stable interface across the two usage sessions.
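As an illustration only (this is not the MRL topology or optimization procedure described in the Methods section), one simple way to see how categorical calibration labels can yield continuous, simultaneous multi-DoF outputs is to re-encode each movement class as a vector of per-DoF target activations and train a small multi-output network on those targets:

```python
# Illustrative only: NOT the MRL architecture; a generic multi-output network
# trained on categorical labels re-encoded as per-DoF target activations, to show
# how categorical calibration data can yield simultaneous, continuous outputs.
import numpy as np
from sklearn.neural_network import MLPRegressor

# Assumed encoding: each instructed movement class maps to target activations of
# three DoFs (e.g. wrist flex/extend, wrist rotate, hand open/close).
class_to_dofs = {
    0: np.array([0.0, 0.0, 0.0]),   # rest
    1: np.array([1.0, 0.0, 0.0]),   # wrist flexion
    2: np.array([0.0, 1.0, 0.0]),   # wrist rotation
    3: np.array([0.0, 0.0, 1.0]),   # hand close
}

rng = np.random.default_rng(0)
X = rng.standard_normal((400, 8))                 # placeholder sEMG feature vectors
labels = rng.integers(0, 4, size=400)             # categorical movement instructions
Y = np.stack([class_to_dofs[c] for c in labels])  # continuous multi-DoF targets

net = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0).fit(X, Y)
print(net.predict(X[:3]))   # per-DoF activations, usable as simultaneous commands
```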


Short-Term Power Prediction of Building Integrated Photovoltaic (BIPV) System Based on Machine Learning Algorithms

One of the biggest challenges is ensuring large-scale integration of photovoltaic systems into buildings. This work presents power prediction for a building-integrated photovoltaic system across the building’s various orientations, based on machine learning data science tools. The proposed prediction methodology comprises a data quality stage, a machine learning algorithm, a weather clustering assessment, and an accuracy assessment. The results showed that applying linear regression coefficients to the forecast outputs of the developed photovoltaic power generation neural network improved the PV power generation forecast. The final models produced accurate forecasts, exhibiting root mean square errors of 4.42% for NN, 16.86% for QSVM, and 8.76% for TREE. Results are presented for the building façade and roof applications: flat roof, south façade, east façade, and west façade.

1. Introduction

Economic growth has given rise to increasing global demand for electrical energy production and consumption. Solar power plants are among the most common renewable energy sources [1–4]. Satellite technology allows us to fly around the world. In addition to being easily installed on the roof of a building, PV modules can act as stand-alone solar power generators [5–7]. The installation of photovoltaic panels has increased every year in recent years; globally, approximately 117 gigawatts of solar PV capacity were installed in 2019 [8]. Traditional grid-based power distribution also relies on stable power supply lines and a consistent load [9]. Grid efficiency can be enhanced by controlling both suppliers and customers. Solar PV power can interfere with conventional power generation, making conventional generation difficult or even unworkable [10–14].

Machine learning has become more common in forecasting and classification because it reliably handles complex or nonlinear problems. Such methods can capture the relationship between input and output variables even when an explicit representation is not available [15]. The most common are artificial neural networks (ANN) [16], fuzzy logic (FL) [17], support vector machines (SVM) [18], the K-nearest neighbor algorithm (kNN) [19], and decision tree- (DT-) [20] based techniques (including random forest (RF) [21]). Artificial intelligence approaches have been discussed in detail for improving photovoltaic performance forecasting models [22]. Infrared thermography (IRT) is the most commonly accepted technique for categorising faulty panels [23], centred on image processing techniques to distinguish between healthy and defective panels among all image processing-based approaches. Various patterns, challenges, and opportunities for the implementation of ANN-based approaches have been highlighted [24]. Random forests were the most reliable among the various forecasting techniques used by site and regional forecasters [25]. Several mathematical models were developed to increase the accuracy of diagnoses [26]. Also, the use of PHANN for clear-sky days resulted in a standard deviation of 5.3%. In [27], the authors used a recurrent neural network (LSTM-RNN) to predict future PV generation, with RMSE results of approximately 82.15 W and 136.87 W for two separate datasets. Most PV forecasts still have a relative RMSE of more than 10% [28].

In ELM, the input weights and hidden-layer biases are allocated randomly, and the output weights are then computed by least squares rather than by iterative approaches. ELM thereby helps to capture information and to transition between situations. The capability of many ELM-based models has been presented, and their excellent capacity has been verified in predicting PV power production [29]. Results have also been obtained using an ensemble method combining lower upper bound estimation (LUBE) and ensemble learning. Considering the degree of convergence and prediction precision [30], ELM has been combined with the entropy method to build a hybrid forecast method for short-term PV power production, which is preferable to the radial basis function neural network and the generalised deep learning network [31]. An alternative multi-model approach based on ELM has also been proposed for PV power prediction. Forecasting is essential for operating power plants and other utilities [32]. A feasibility-prospective forecasting model has also been developed and proved effective in predicting the short-term power production of PV systems. Certain parameters are allocated at random in traditional artificial neural networks (ANN), causing a certain degree of error and uncertainty in the prediction performance. Several artificial neural networks (ANNs) have been integrated with genetic algorithms (GA) to solve this problem. A convolutional neural network system was reported to be effective in predicting solar irradiance, where the GA was applied to optimise the associated hyperparameters. However, in conventional ANNs, many parameters must be carefully optimised to establish learning strategies [33].

Algorithms have also been developed to predict PV power generation [34]. Time series are decomposed into high- and low-frequency components; the DBN model is then used to forecast the high-frequency trends, and the forecasted trend components are summarised into the final results. The GA is the algorithm most commonly used to solve nonlinear optimisation problems. The genetic algorithm (GA) is based on the theory of evolution and the calculation of individual fitness functions. The GA involves the iterative selection of elite individuals, crossover operations, and mutations [35]. A support vector machine (SVM) has been used to predict short-term solar PV power, with the SVM parameters optimised using the Meta-SVM Optimizer [36].

Since SVM uses quadratic programming, SVM training takes a long time when the number of items is large. Energy resource forecasting based on neural networks is very good at predicting solar power due to its strong task scheduling and outstanding mapping capability. A combined method for predicting PV power with ANN and analogue integration has been investigated [37]. ELM is developed and built on a feed-forward neural network (FNN); it can train without iteratively adjusting its weights and thresholds and is characterised by rapid training speed, strong generalisation capability, and broad applicability [38]. The ELM model can effectively address complex nonlinear regression problems and has already been used to predict the irradiance and power output of PV systems. The specific objective of explaining PV power fluctuations using a graphical method based on the ELM model has been reported [39]. A similarity-based photovoltaic module power prediction model has been developed using the available historical data [40]. An ANN model has been used to predict monthly global solar radiation for power prediction based on geographical location [41]. The performance of a photovoltaic module varies with geographical location. A machine learning algorithm has been developed for PV system prediction, and the stability of the model has been validated [42]. An algorithm has also been developed to predict load dispatch for a grid-connected photovoltaic system in a microgrid [43].

The PV output power is predicted using various algorithms, and their accuracy is compared. Both short-term (day-ahead) and long-term power predictions with respect to climate conditions are analysed. A study on the building-integrated PV system is therefore undertaken. In this study, the PV output is normalised based on experimental studies, and machine learning algorithms are used to predict the performance of the building-integrated photovoltaic system for the various orientations. The systems studied are the flat roof and the south-, east-, and west-oriented façades. Artificial neural network, decision tree, and quadratic support vector machine algorithms are used for short-term power prediction of the BIPV system.

2. Machine Learning Algorithms

State-of-the-art solar power technology will only be established if forecasters can predict how much solar power will be available at a specific location at a given time. The built model can be replicated since it includes only environmental data without regard to geographical location. The machine learning models are developed with training, validation, and test sets, depending on the design’s nature. The work flow chart is shown in Figure 1.
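A minimal sketch of such a split is shown below; the 70/15/15 ratio, feature names, and the use of scikit-learn are illustrative assumptions rather than the settings used in this work.

```python
# Minimal sketch: splitting environmental/PV data into training, validation,
# and test sets (the 70/15/15 ratio is an illustrative assumption).
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 4))    # e.g. irradiance, temperature, humidity, wind speed
y = rng.normal(size=1000)         # normalised PV output power (placeholder)

X_train, X_tmp, y_train, y_tmp = train_test_split(X, y, test_size=0.30, random_state=0)
X_val, X_test, y_val, y_test = train_test_split(X_tmp, y_tmp, test_size=0.50, random_state=0)
print(len(X_train), len(X_val), len(X_test))   # 700 150 150
```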

2.1. Artificial Neural Network

Artificial neural networks (ANNs) can describe nonlinear, complex, and incremental behaviours through input-output training patterns. An ANN is characterised by an architecture that specifies the connections between nodes, the method for determining the weights, and the activation function.

Artificial neural networks’ ability to learn from large samples makes it possible to solve several major and complex problems [44]. The most common neural structure is the feed-forward network. A typical neural network is made up of different computational components called neurons. The weights and biases of the input, hidden, and output layers are jointly optimised until the output neuron values fall within an acceptable error tolerance. This approach has been successfully applied to regression problems [45]. The feed-forward network model is presented in Figures 2 and 3; the network contains a hidden layer with many nodes, and the user-defined function types are shown in Table 1. ANN methods can handle nonlinear systems. Still, problems of overfitting, local minima, random initialisation, intensive training data requirements, and increased complexity due to multilayered architectures are their limitations [46, 47].
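The sketch below shows a feed-forward regression network of this kind applied to synthetic PV data; the features, layer sizes, and scikit-learn implementation are illustrative assumptions and not the configuration reported in Table 1.

```python
# Minimal sketch of a feed-forward ANN for short-term PV power regression.
# The weather features and network size are illustrative, not the paper's settings.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
n = 2000
irradiance = rng.uniform(0, 1000, n)          # W/m^2
temperature = rng.uniform(0, 40, n)           # deg C
# Synthetic PV power with a mild temperature derating plus noise.
power = 0.15 * irradiance * (1 - 0.004 * (temperature - 25)) + rng.normal(0, 5, n)

X = np.column_stack([irradiance, temperature])
model = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(20, 20), max_iter=3000, random_state=0),
)
model.fit(X[:1500], power[:1500])
rmse = np.sqrt(mean_squared_error(power[1500:], model.predict(X[1500:])))
print(f"Test RMSE: {rmse:.2f} W")
```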



The basic theme of this article is that one can cast learning, inference, and decision making as processes that resolve uncertainty about the world. This theme is central to many issues in psychology, cognitive neuroscience, neuroeconomics, and theoretical neurobiology, which we consider in terms of curiosity and insight. The purpose of this article is not to review the large literature in these fields or provide a synthesis of established ideas (e.g., Schmidhuber, 1991; Oaksford & Chater, 2001; Koechlin et al., 2003; Botvinick & An, 2008; Nelson et al., 2010; Navarro & Perfors, 2011; Tenenbaum, Kemp, Griffiths, & Goodman, 2011; Botvinick & Toussaint, 2012; Collins & Koechlin, 2012; Solway & Botvinick, 2012; Donoso et al., 2014). Our purpose is to show that the issues this diverse literature addresses can be accommodated by a single imperative (minimization of expected free energy, or resolution of uncertainty) that already explains many other phenomena–for example, decision making under uncertainty, stochastic optimal control, evidence accumulation, addiction, dopaminergic responses, habit learning, reversal learning, devaluation, saccadic searches, scene construction, place cell activity, omission-related responses, mismatch negativity, P300 responses, phase-precession, and theta-gamma coupling (Friston, FitzGerald et al., 2016; Friston, FitzGerald, Rigoli, Schwartenbeck, & Pezzulo, 2017). In what follows, we ask how the resolution of uncertainty might explain curiosity and insight.

1.1 Curiosity

Curiosity is an important concept in many fields, including psychology (Berlyne, 1950, 1954; Loewenstein, 1994), computational neuroscience, and robotics (Schmidhuber, 1991; Oaksford & Chater, 2001). Much of neural development can be understood as learning contingencies about the world and how we can act on the world (Saegusa, Metta, Sandini, & Sakka, 2009; Nelson et al., 2010; Nelson, Divjak, Gudmundsdottir, Martignon, & Meder, 2014). This learning rests on intrinsically motivated curious behavior that enables us to predict the consequences of our actions: as nicely summarized by Still and Precup (2012), “A learner should choose a policy that also maximizes the learner's predictive power. This makes the world both interesting and exploitable.” This epistemic, world-disclosing perspective speaks to the notion of optimal data selection and important questions about how rational or optimal we are in querying our world (Oaksford, Chater, & Larkin, 2000; Oaksford & Chater, 2003). Clearly, the epistemic imperatives behind curiosity are especially prescient in developmental psychology and beyond: “In the absence of external reward, babies and scientists and others explore their world. Using some sort of adaptive predictive world model, they improve their ability to answer questions such as what happens if I do this or that?” (Schmidhuber, 2006). In neurorobotics, these imperatives are often addressed in terms of active learning (Markant & Gureckis, 2014; Markant, Settles, & Gureckis, 2016), with a focus on intrinsic motivation (Baranes & Oudeyer, 2009). Active learning and intrinsic motivation are also key concepts in educational psychology, where they play an important role in enabling insight and understanding (Eccles & Wigfield, 2002).

1.2 Insight and Eureka Moments

The Eureka effect (Auble, Franks, & Soraci, 1979) was introduced to psychology by comparing the recall for sentences that were initially confusing but subsequently understood. The implicit resolution of confusion appears to be the main determinant of recall and the emotional concomitants of insight (Shen, Yuan, Liu, & Luo, 2016). Several psychological theories for solving insight problems have been proposed—for example, progress monitoring and representational change theory (Knoblich, Ohlsson, & Raney, 2001; MacGregor, Ormerod, & Chronicle, 2001). Both enjoy empirical support, largely from eye movement studies (Jones, 2003). Furthermore, several psychophysical and neuroimaging studies have attempted to clarify the functional anatomy of insight (see Bowden, Jung-Beeman, Fleck, & Kounios, 2005, for a psychological review, and Dresler et al., 2015, for a review of the neural correlates of insight in dreaming and psychosis). In what follows, we offer a normative framework that complements psychological theories by describing how curiosity engenders insight. Our treatment is framed by two questions posed by Berlyne (1954) in his seminal treatment of curiosity: “The first question is why human beings devote so much time and effort to the acquisition of knowledge. … The second question is why, out of the infinite range of knowable items in the universe, certain pieces of knowledge are more ardently sought and more readily retained than others?” (p. 180).

In brief, we will try to show that the acquisition of knowledge and its retention are emergent properties of active inference—specifically, that curiosity manifests as an active sampling of the world to minimize uncertainty about hypotheses—or explanations—for states of the world, while retention of knowledge entails the Bayesian model selection of the most plausible explanation. The first process rests on curious, evidence-accumulating, uncertainty-resolving behavior, while the second operates on knowledge structures (i.e., generative models) after evidence has been accumulated.

Our approach rests on the free energy principle, which asserts that any sentient creature must minimize the entropy of its sensory exchanges with the world. Mathematically, entropy is uncertainty or expected surprise, where surprise can be expressed as a free energy function of sensations and (Bayesian) beliefs about their causes. This suggests that creatures are compelled to minimize uncertainty or expected free energy. In what follows, we will see that resolving different sorts of uncertainty furnishes principled explanations for different sorts of behavior. These levels of uncertainty pertain to plausible states of the world, plausible policies that change those states, and plausible models of those changes.
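For readers who want the generic form, variational free energy can be written as follows; this is the standard decomposition used throughout the variational inference and active inference literature, not a quotation of any specific equation in the text. Here q(s) denotes current beliefs about hidden states s, and p(o, s) is the generative model of outcomes o and their causes.

```latex
F \;=\; \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(o, s)\big]
  \;=\; \underbrace{D_{\mathrm{KL}}\big[q(s)\,\|\,p(s \mid o)\big]}_{\text{divergence}\;\ge\;0}
  \;\;\underbrace{-\,\ln p(o)}_{\text{surprise}}
```

Because the divergence term is non-negative, free energy is an upper bound on surprise; minimizing it, and its expected value under a policy, therefore minimizes uncertainty in the sense used above.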

The first level of uncertainty is about the causes of sensory outcomes under a particular policy (i.e., sequence of actions). Reducing this sort of uncertainty corresponds to perceptual inference (a.k.a. state estimation). In other words, the first thing we need to do is infer the current state of the world and the context in which we are operating. We then have to contend with uncertainty about policies per se that can be cast in terms of uncertainty about future states of the world, outcomes, and the probabilistic contingencies that bind them. We will see that minimizing these three forms of expected surprise—by choosing an uncertainty resolving policy—corresponds to information-seeking epistemic behavior, goal-seeking pragmatic behavior, and novelty-seeking curious behavior, respectively. In short, by pursuing the best policy, we accumulate experience and reduce uncertainty about probabilistic contingencies through epistemic learning—namely, inferring (the parameters of our models of) how outcomes are generated.

Finally, curious, novelty-seeking policies enable us to reduce our uncertainty about our generative models per se, leading to structure learning, insight, and understanding. Here, a generative model constitutes a hypothesis about how observable outcomes are generated, where we entertain competing hypotheses that are, a priori, equally plausible. In short, the last level of uncertainty reduction entails the selection of models that render outcomes the least surprising, having suppressed all other forms of uncertainty. All but the last process require experience to resolve uncertainty about either the states (inference) or parameters (learning) of a particular model. However, optimization of the model per se can proceed in a fact-free, or outcome-free, fashion, using experience accumulated to date. In other words, no further facts or outcomes are necessary for this last level of optimization: facts and outcomes are constitutive of the experience on which this optimization relies. It is this Bayesian model selection we associate with fact-free learning (Aragones, Gilboa, Postlewaite, & Schmeidler, 2005) and the emergence of insight (Bowden et al., 2005).




2 Literature review

Numerous studies regarding online learning across higher education have been conducted that have enhanced both the understanding and practical implications of adopting different modes of online learning, such as blended, asynchronous, and synchronous learning [15]. In determining the success of e-learning in higher education, student satisfaction is an important indicator of performance [2, 5, 8–16]. Duque (2013) proposed a framework for evaluating higher education performance with students’ satisfaction, perceived learning outcomes, and dropout intentions, and found that dropout intentions were strongly and negatively associated with student satisfaction [32]. Meanwhile, Kuo et al. (2014) highlighted the close relationship between student satisfaction and motivation, dropout rates, success, and learning commitment [33]. Furthermore, Pham et al. (2019) showed a positive relationship between student satisfaction and loyalty among Vietnamese adults in higher education [13]. According to the E-learning systems success (EESS) model, student satisfaction is a key component in determining E-learning success [8]. Therefore, comprehensively understanding the underlying factors influencing student satisfaction will enable the improvement of online teaching and learning design and execution [16].

2.1 Current theories for satisfaction in E-learning

Multiple factors have been proposed that identify and influence students’ satisfaction with E-learning [8]. An early E-learning research model developed by DeLone and McLean (2003) was primarily based on the quality of information, systems, and services that determined user satisfaction [34]. This model has been used to compare E-learning success between male and female students in Malaysian universities during the COVID-19 pandemic [14]. Another significant approach for developing a theoretical framework in E-learning research is the user satisfaction approach [8]. A recent study conducted by Yawson and Yamoah (2020) adopted this approach, using a 7-point Likert scale to measure satisfaction with E-learning in higher education in a developing country (i.e., Ghana) [16]. Question items in their study included the domains of course design, delivery, interaction, and delivery environment. However, this study did not focus on ERL, although the study period overlapped with the pandemic. Apart from the aforementioned models, other technology acceptance and E-learning quality models have been developed with an emphasis on usefulness and ease of use [8, 35]. Due to the unique characteristics, strengths, and limitations of each research model, Al-Fraihat et al. (2020) further formulated a multidimensional conceptual model for evaluating E-learning systems success (EESS) more holistically [8].

Interestingly, a recent study by Shim and Lee (2020) developed a semi-structured questionnaire, without adopting the aforementioned models, to conduct a thematic analysis investigating colleges’ experience of ERL during the COVID-19 pandemic in South Korea [5]. Similarly, Alqurshi (2020) used a tailor-made questionnaire to measure students’ satisfaction with 5-point Likert-scale questions focusing on virtual classrooms, completion of course learning outcomes, and alternative assessments in different institutions in Saudi Arabia [10]. Because the previously mentioned theoretical models were built to evaluate pre-planned E-learning, while the deployment of ERL during the COVID-19 pandemic was abrupt, direct use of E-learning research models may not suitably reflect the underlying factors affecting the success and satisfaction of ERL. Therefore, a tailor-made survey kit was recently developed by EDUCAUSE that institutions can rapidly adopt to gather feedback from higher education stakeholders [36]. Accordingly, the subsequent literature review has been primarily based on the items and constructs proposed in the EDUCAUSE survey kit, while taking reference from the components of the multidimensional EESS model.

2.2 Readiness and accessibility

The first part of the EDUCAUSE survey kit (2020) focuses on technological issues and challenges during the transition to remote learning [36]. Questions included the level of discomfort and familiarity of instructors and students while using technological applications, the adequacy of digital replacements for face-to-face collaboration tools (e.g., whiteboards), and accessibility to a reliable internet connection, communication software, and specialized software and tools. According to Al-Fraihat et al. (2020), a direct association between system quality and student satisfaction was assumed in the original model of Delone and Mclean (2003) [8, 34]. Similarly, other literature also suggests that improved system quality positively influences student satisfaction in E-learning [8, 37]. In the EESS model, technical system quality has several subset items, including ease of use and learning, user requirements, and the system’s features, availability, reliability, fulfillment, security, and personalization. Meanwhile, Al-Fraihat et al. (2020) highlighted different obstacles when adopting E-learning in developing and developed countries [8]. For example, resources, accessibility, and infrastructure are more important for developing countries, while information quality and usefulness of the system are more important in developed regions. However, low-income families also exist in developed countries, and students from relatively poor living environments may face problems similar to those of students living in developing countries, even though the technological infrastructure of higher education institutes is better developed.

Also, self-efficacy, defined as an individual’s belief in their own ability to perform a certain task or challenge or to successfully engage with educational technology [38, 39], was shown to be interconnected with student satisfaction levels [40]. Recently, Prifti (2020) identified that learning management system self-efficacy positively influenced student satisfaction in blended learning in Tirana, Albania, while both platform content and accessibility were important constructs affecting the self-efficacy level [41]. Similarly, Geng et al. (2019) found that technology readiness positively influenced learning motivation during blended learning in higher education [42]. Interestingly, Alqurashi (2018) reported conflicting findings regarding the impact of students’ self-efficacy for using technology on student satisfaction, as more recent studies suggest university students have become more competent and confident in using technology when conducting online learning [43]. However, Rizun et al. (2020) recently confirmed that self-efficacy levels did affect students’ acceptance in terms of perceived ease of use and usefulness when conducting ERL in Poland during the COVID-19 pandemic. Since the circumstances of well-planned and designed E-learning are different from ERL, it is important to assess constructs such as accessibility and students’ readiness, including their self-efficacy, to determine ERL success [44].

2.3 Instructor, assessment, and learning

Another focus in the EDUCAUSE survey kit is learning and education-related issues. Focused questions include personal preference for face-to-face learning, assessment requirements, students’ attention to remote classes and activities, the availability and responsiveness of instructors, and whether the original lessons were well translated to a remote format. Alqurashi (2018) showed the importance of quality learner-instructor interaction as two-way communication between the instructor and students [43]. In addition, his study used multiple regression, which showed that learner-content interaction was the most important predictor of student satisfaction, further supporting the findings of the Kuo et al. (2014) study [33]. Providing user-friendly and accessible course materials assists in motivating students’ learning and understanding, in turn leading to increased student satisfaction. Meanwhile, the authors recommended that students pay more attention to the feedback and responses from course instructors, such as asking and answering questions, receiving feedback, and taking part in online discussions. Recently, Muzammil et al. (2020) demonstrated similar findings in Indonesian higher education using a structural equation model [12]. They showed that student-tutor interaction significantly contributed to the level of student engagement, whereas student satisfaction levels were largely dictated by engagement level. This was further demonstrated by Pham et al. (2019), who showed that the instructor’s ability to deliver quality E-learning provision affected Vietnamese college students’ satisfaction and loyalty [13]. In their study, data regarding perceived E-learning instructor quality from the students’ perspective were gathered via several questions focusing on instructors’ knowledge, responsiveness, consistency in delivering good lectures, organization, class preparation, encouragement of interactive participation, and whether the instructors have the students’ best long-term interests in mind. However, a recent review by Carpenter et al. (2020) raised the issue of students’ “illusional learning”, where well-polished lectures delivered by enthusiastic and engaging instructors can inflate students’ subjective impressions and judgments of learning [45]. Since students’ evaluations of teaching effectiveness and teacher quality may carry a strong bias, a questionnaire concerning the instructor and E-learning for students should focus on familiarity with E-learning technology, responsiveness, and availability rather than on teaching quality, performance, and usefulness.

Based on the EESS model, the diversity of assessment materials significantly determines educational system quality, which contributes to the prediction of perceived satisfaction [8, 37]. The importance of assessments for predicting student satisfaction during E-learning was further supported by Hew et al. (2019) [26]. For example, using machine learning and hierarchical linear models, assessments were confirmed as a significant and important sentiment for predicting student satisfaction in MOOCs (Massive Open Online Courses). Recently, Rodriguez et al. (2019) used multiple linear regression to identify assessment procedures and an appropriate level of assessment demand as important predictors of student satisfaction in multiple universities in Andalusia, Spain [30]. When assessment-related aspects are considered while conducting ERL during a crisis, Shim and Lee (2020) identified comments regarding dissatisfaction with assessments, such as the increased burden of final exams after the deletion of mid-term assessments, the vagueness of test evaluations, and increased quantities of assignments during COVID-19 [5]. As certain practical or tutorial classes might be moved to a remote learning format or substituted by other learning activities, changing assessment methods to accommodate such temporary shifts to ERL was necessary to match students’ actual learning quantity and quality. Therefore, evaluating the clarity and appropriateness of accommodated assessments seems essential for predicting students’ satisfaction with ERL during a pandemic.

2.4 Self-concerned

Referring to the EESS model, learners’ anxiety, as part of learner quality, contributes somewhat to perceived satisfaction [8]. Bolliger and Halupa (2012) define anxiety as ‘the conscious fearful emotional state’ and further proposed close relations between computer, internet, and online course anxiety [46]. In their study, a significant but negative association between student anxiety and satisfaction was detected, as several anxiety-related aspects such as performance insecurity, hesitation, and nervousness were proposed to be closely linked with student satisfaction. However, as Alqurashi (2018) emphasized, the high computer and online learning competency of students nowadays may limit the applicability of findings from prior studies addressing the effects of computer- and internet-related anxiety on students’ perceived satisfaction [43]. Therefore, suggested questions in the self-concern section of the EDUCAUSE survey kit include other aspects potentially related to ERL, such as worry about course performance or grades, concern about reduced interaction with classmates and instructors, a potential delay of graduation or program completion, privacy, and food or housing security [36].

2.5 The application of multiple regression in predicting student satisfaction

Numerous researchers have used statistical methods to analyze satisfaction scores, perceived learning, interaction, self-efficacy, and other factors related to online learning [33, 43, 47–51]. For example, multiple linear regressions have been used to produce several predictive models for examining and comparing the interaction and amount of variance explained for different predictors of student satisfaction [43]. Multiple linear regression contains more than one independent variable (X1,…,Xp) and takes the form Y = β0 + β1X1 + β2X2 + … + βpXp; it can be regarded as the expansion of simple linear regression, which fits the straight line Y = β0 + β1X, where β0 is the intercept and β1 is the slope. This statistical method has been widely used because of its simple algorithm and mathematical calculation [43, 52, 53]. Previous studies have shown its strong predictive power in applications, but the estimated regression coefficients can be greatly affected if high correlations between predictors exist (the multicollinearity issue) [54]. Apart from simple linear regression, hierarchical linear models have commonly been used to deal with more complicated data of a nested nature [26]. Meanwhile, stepwise multiple regression, combining forward and backward selection techniques, has been widely adopted for its efficiency in using the minimum number of important predictors to build a successful prediction model. However, numerous studies have pointed out potential flaws of stepwise regression, such as multicollinearity, overfitting, and the selection of nuisance variables rather than useful variables [55, 56]. Since only numerical variables are allowed when building predictive models with multiple linear regression, categorical predictors, including nominal and ordinal variables, must be converted to binary codes using dummy variables before modeling.
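For instance (the variable names and the pandas/scikit-learn tooling are illustrative assumptions, not those of any cited study), categorical predictors can be dummy-coded before fitting a multiple linear regression:

```python
# Minimal sketch: multiple linear regression with dummy-coded categorical predictors.
import pandas as pd
from sklearn.linear_model import LinearRegression

# Illustrative data: one numeric predictor and one categorical predictor.
df = pd.DataFrame({
    "study_hours": [2, 5, 1, 7, 3, 6],
    "device": ["laptop", "phone", "phone", "laptop", "tablet", "laptop"],
    "satisfaction": [3.1, 4.0, 2.5, 4.6, 3.2, 4.4],
})

# Dummy coding converts the categorical column to binary indicator variables.
X = pd.get_dummies(df[["study_hours", "device"]], drop_first=True)
model = LinearRegression().fit(X, df["satisfaction"])
print(dict(zip(X.columns, model.coef_.round(3))), round(model.intercept_, 3))
```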

2.6 The use of machine learning

As opposed to multiple linear regression, other machine learning methods under the umbrella of artificial intelligence are increasingly used for predictive purposes [52, 57–63]. The advantage of machine learning is the ability to use both categorical and numerical predictors to generate models, assessing linear and non-linear relationships between variables as well as the importance of each predictor. Common machine learning algorithms for predicting numerical outcomes with regressors have been widely studied and adopted in different contexts, such as K-nearest neighbor (KNN) [57], support vector regression (SVR) [58, 61], ensembles of decision trees with random forest (RF) [60], the gradient boosting method (GBM) [62, 63], multilayer perceptron regression (MLPR) simulating the structure and operation of human neural network architecture [52], and elastic net (ENet) [64].

KNN is a non-parametric method that makes a prediction for a query point by computing the Euclidean distance between that point and every point in the training set, selecting the K closest training points, and averaging their target values [57]. It is a simple machine learning method and easy to tune for optimization.
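
A minimal hand-rolled sketch of that procedure, using NumPy; the data values are illustrative only.

```python
import numpy as np

X_train = np.array([[3.2, 3.5], [4.1, 4.0], [2.8, 2.9], [3.9, 3.7], [4.5, 4.2]])
y_train = np.array([3.4, 4.2, 2.7, 3.8, 4.4])
query = np.array([3.6, 3.6])
K = 3

distances = np.linalg.norm(X_train - query, axis=1)  # Euclidean distance to each training point
nearest = np.argsort(distances)[:K]                  # indices of the K closest points
prediction = y_train[nearest].mean()                 # average of their target values
print(prediction)
```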

Support vector machines were originally developed as a supervised learning technique for classification problems; SVR extends the original algorithm to multivariate regression [57, 61]. By mapping the data into a high-dimensional space through a kernel and constructing hyperplanes there, SVR allows problems that are not linearly separable in the original space to be handled with a linear model [57, 61]. SVR is therefore a good option for high-dimensional data with a lower risk of overfitting, though it is sensitive to outliers and can be very time-consuming to train on large datasets.
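
A minimal sketch of SVR with an RBF kernel, assuming scikit-learn and synthetic data; because SVR is sensitive to feature scales, it is commonly combined with standardization as shown here.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))                              # 50 observations, 4 predictors (synthetic)
y = 0.8 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=50)

# Standardize predictors, then fit an RBF-kernel support vector regressor.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=1.0, epsilon=0.1))
svr.fit(X, y)
print(svr.predict(X[:3]))
```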

RF is a non-parametric method that builds an ensemble of decision trees: for classification the trees vote for the most popular class, while for regression their outputs are averaged to form the final prediction. In training, a multitude of decision trees is constructed from bootstrap samples and random subsets of the predictor variables [59, 60]. Random forest is broadly applicable because it is fast and efficient at generating predictions and requires only a few parameters to tune for model optimization. Moreover, it handles high-dimensional problems and provides feature importance scores for further analysis.
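
A minimal sketch of random forest regression with feature importances, assuming scikit-learn; the data and feature names are placeholders.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.2, size=100)

# Trees are built on bootstrap samples; their predictions are averaged.
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
for name, imp in zip(["interaction", "self_efficacy", "noise"], rf.feature_importances_):
    print(name, round(imp, 3))   # importances sum to 1 across features
```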

Unlike other tree-based techniques that grow trees level by level, LightGBM is an improved gradient boosting algorithm that grows trees leaf-wise. This enhances scalability and efficiency and reduces computational time without sacrificing model accuracy, and recent studies have shown excellent predictive performance on a range of datasets [62, 63].
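
A minimal sketch of a leaf-wise gradient boosting regressor, assuming the lightgbm package is installed; the parameter values and synthetic data are illustrative only.

```python
import numpy as np
from lightgbm import LGBMRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(scale=0.1, size=200)

gbm = LGBMRegressor(
    n_estimators=300,
    learning_rate=0.05,
    num_leaves=31,    # leaf-wise growth is controlled mainly by the number of leaves
)
gbm.fit(X, y)
print(gbm.predict(X[:3]))
```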

MLPR is a form of feedforward artificial neural network loosely modeled on the structure and activity of the human nervous system. It consists of input, hidden, and output layers of nodes; each neuron applies a nonlinear activation function, which allows the network to model non-linear relationships in supervised learning tasks [52, 57].
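
A minimal sketch of a multilayer perceptron regressor with two hidden layers, assuming scikit-learn; the hidden-layer sizes and data are illustrative.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = np.tanh(X[:, 0]) + 0.5 * X[:, 1] + rng.normal(scale=0.1, size=200)

# Input layer -> two hidden layers with ReLU activations -> output layer.
mlp = make_pipeline(
    StandardScaler(),
    MLPRegressor(hidden_layer_sizes=(32, 16), activation="relu",
                 max_iter=2000, random_state=0),
)
mlp.fit(X, y)
print(mlp.predict(X[:3]))
```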

The ENet method was named after a stretchable fishing net that retains "all the big fish": it performs automatic predictor selection together with continuous shrinkage and allows groups of correlated variables to be selected together, combining the features and benefits of ridge regression and the least absolute shrinkage and selection operator (LASSO). It can be regarded as a regularized extension of multiple linear regression fitted by ordinary least squares [65]. Recent studies have demonstrated the superior performance of ENet over other regression methods in handling multicollinearity among predictors for numerical prediction [66, 67].
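
A minimal sketch of elastic net regression, assuming scikit-learn, on deliberately correlated synthetic predictors; l1_ratio blends the ridge (0) and LASSO (1) penalties, and alpha sets the overall shrinkage.

```python
import numpy as np
from sklearn.linear_model import ElasticNet

rng = np.random.default_rng(0)
x1 = rng.normal(size=100)
# Second predictor is nearly identical to the first (strong multicollinearity).
X = np.column_stack([x1, x1 + rng.normal(scale=0.05, size=100), rng.normal(size=100)])
y = 1.5 * x1 + rng.normal(scale=0.1, size=100)

enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)
print(enet.coef_)   # correlated predictors tend to be shrunk and kept as a group
```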

There is no single best machine learning or statistical method for prediction accuracy: the structure and nature of a dataset, including the number of variables and the dimensionality and cardinality of the predictors, can substantially influence the accuracy of each algorithm [52, 57, 60–63, 67]. Although previous studies have shown machine learning algorithms to outperform multiple linear regression, especially on complicated models or highly complex datasets, most machine learning methods are black boxes and hard to interpret [68–70]. Consequently, the trade-off between prediction accuracy and explanatory capability has become a contested issue when deciding between simple, transparent models such as multiple linear regression and potentially more accurate but complicated black-box machine learning models. Recently, Abu Saa et al. (2019) highlighted the frequent use of machine learning techniques for educational data mining, including decision trees, naïve Bayes, artificial neural networks, support vector machines, and logistic regression [31]. The use of machine learning algorithms to address educational research problems such as student satisfaction is therefore a promising direction for future exploration.
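
One practical response to there being no universally best method is to compare candidate models under the same cross-validation split. A minimal sketch, assuming scikit-learn and synthetic data; the model choices and scoring metric are illustrative.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LinearRegression, ElasticNet
from sklearn.ensemble import RandomForestRegressor
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(150, 5))
y = X[:, 0] - 0.5 * X[:, 1] ** 2 + rng.normal(scale=0.2, size=150)

models = {
    "linear regression": LinearRegression(),
    "elastic net":       ElasticNet(alpha=0.1),
    "random forest":     RandomForestRegressor(n_estimators=200, random_state=0),
    "KNN":               KNeighborsRegressor(n_neighbors=5),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(f"{name}: mean R^2 = {scores.mean():.3f}")
```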

2.7 Feature selection before building predictive models

To build predictive models successfully, feature selection is a critical and frequently used technique in both statistics and machine learning for choosing a subset of attributes from the original features. The aim is to reduce the high-dimensional feature space by removing redundant and irrelevant predictors and retaining only highly relevant features, thereby enhancing model performance [56, 71]. In addition to the automated selection in stepwise regression, recursive feature elimination (RFE) is another commonly used feature selection method. By repeatedly eliminating the lowest-ranked features and comparing model accuracy after each RFE iteration, the subset of predictors used to formulate the optimal model is finalized. Previous studies have shown that RFE approaches can enhance prediction accuracy in classification and regression models once noise variables are removed [71–74].
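
A minimal sketch of recursive feature elimination with cross-validation (RFECV), assuming scikit-learn; the base estimator and synthetic data are placeholders.

```python
import numpy as np
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(120, 8))   # 8 candidate predictors, most of them irrelevant noise
y = 2.0 * X[:, 0] - 1.0 * X[:, 3] + rng.normal(scale=0.2, size=120)

# Drop the lowest-ranked feature at each step and keep the best-scoring subset.
selector = RFECV(LinearRegression(), step=1, cv=5, scoring="r2").fit(X, y)
print("selected features:", np.where(selector.support_)[0])
print("feature ranking:  ", selector.ranking_)
```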


Mechanical fault diagnosis using Convolutional Neural Networks and Extreme Learning Machine

A novel diagnosis model, integrating Convolutional Neural Networks and an Extreme Learning Machine, is proposed.

A weight orthogonality constraint is employed in the CNN to achieve divergent feature representations (a generic sketch of such a penalty follows these highlights).

The Extreme Learning Machine is used to improve the classification performance.

The proposed method achieves higher classification accuracy and needs less computational time.
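
A minimal, generic sketch of a weight-orthogonality penalty of the kind the highlights describe: the convolutional kernels are flattened into a matrix W and the penalty ||W W^T - I||_F^2 is added to the training loss so that filters learn less redundant representations. The exact constraint used in the cited model is not specified here; this NumPy example uses a random weight tensor purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
conv_weights = rng.normal(size=(16, 3, 5, 5))   # 16 filters, 3 input channels, 5x5 kernels

W = conv_weights.reshape(16, -1)                # one row per filter
gram = W @ W.T                                  # pairwise filter similarities
penalty = np.sum((gram - np.eye(16)) ** 2)      # squared Frobenius distance from identity
print(penalty)                                  # would be added (scaled) to the task loss
```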



Science, Vol. 372, Issue 6539 (16 April 2021). By Nicholas A. Steinmetz, Cagatay Aydin, Anna Lebedeva, Michael Okun, Marius Pachitariu, Marius Bauza, Maxime Beau, Jai Bhagat, Claudia Böhm, Martijn Broux, Susu Chen, Jennifer Colonell, Richard J. Gardner, Bill Karsh, Fabian Kloosterman, Dimitar Kostadinov, Carolina Mora-Lopez, John O'Callaghan, Junchol Park, Jan Putzeys, Britton Sauerbrei, Rik J. J. van Daal, Abraham Z. Vollan, Shiwei Wang, Marleen Welkenhuysen, Zhiwen Ye, Joshua T. Dudman, Barundeb Dutta, Adam W. Hantman, Kenneth D. Harris, Albert K. Lee, Edvard I. Moser, John O'Keefe, Alfonso Renart, Karel Svoboda, Michael Häusser, Sebastian Haesler, Matteo Carandini, and Timothy D. Harris.

An approach has been developed that allows recording from the same neurons in a freely behaving animal for weeks and months.


5. Conclusions

In this study, we evaluated the effects of using sub-image patches for synthetic CT (sCT) image generation in head and neck cancer patients with two state-of-the-art generative adversarial network models, pix2pix and CycleGAN. On our independent test sets, both pix2pix and CycleGAN achieved absolute percent dose differences of 2% or less. While indicative of sufficient dosimetric accuracy, these results come from a small sample, and the methods generally need evaluation on a larger cohort. We also found that modeling aleatoric uncertainty by combining overlapping sub-patch HU estimations may provide estimates of reliability in sCT generation and help identify regions with potentially problematic domain transformations.
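
A minimal sketch of how overlapping sub-patch predictions can be combined into a per-voxel mean (the sCT HU estimate) and variance (a simple uncertainty proxy). The patch predictor, patch size, and stride below are placeholders, not the trained pix2pix or CycleGAN models.

```python
import numpy as np

image_shape = (64, 64)
patch, stride = 16, 8
sum_map = np.zeros(image_shape)
sq_map = np.zeros(image_shape)
count = np.zeros(image_shape)

def predict_patch(y, x):
    # Placeholder for a trained generator's HU prediction on one sub-patch.
    return np.random.default_rng(y * 1000 + x).normal(loc=0.0, scale=10.0, size=(patch, patch))

# Slide the patch window over the image and accumulate overlapping predictions.
for y in range(0, image_shape[0] - patch + 1, stride):
    for x in range(0, image_shape[1] - patch + 1, stride):
        p = predict_patch(y, x)
        sum_map[y:y + patch, x:x + patch] += p
        sq_map[y:y + patch, x:x + patch] += p ** 2
        count[y:y + patch, x:x + patch] += 1

mean_hu = sum_map / count                                   # combined sCT estimate per voxel
var_hu = np.maximum(sq_map / count - mean_hu ** 2, 0.0)     # spread across overlapping patches
```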

