1 Introduction
The purpose of this paper is to establish the notion of vector-valued reproducing kernel Banach spaces and demonstrate its applications to multi-task machine learning. Built on the theory of scalar-valued reproducing kernel Hilbert spaces (RKHS) [3], kernel methods have proven successful in single-task machine learning [10, 14, 29, 30, 33]. Multi-task learning, where the unknown target function to be learned from finite sample data is vector-valued, appears more often in practice. References [13, 25] proposed the development of kernel methods for learning multiple related tasks simultaneously. The mathematical foundation used there was the theory of vector-valued RKHS [5, 27]. Recent progress in vector-valued RKHS can be found in [7, 8, 9]. In such a framework, both the space of the candidate functions used for approximation and the output space are chosen to be Hilbert spaces.
There are occasions where it is desirable to select the space of candidate functions, the output space, or both as Banach spaces. Hilbert spaces constitute a special and limited class of Banach spaces: any two Hilbert spaces over a common number field with the same dimension are isometrically isomorphic. By reaching out to other Banach spaces, one obtains a greater variety of geometric structures and norms that are potentially useful for learning and approximation. Moreover, training data might come with intrinsic structures that make it impossible or inappropriate to embed them into a Hilbert space, and learning schemes based on features in a Hilbert space may not work well for such data. Finally, in some applications a Banach space norm is employed for a particular purpose. A typical example is the linear programming regularization in coefficient-based regularization for machine learning
[29], where the norm is employed to obtain sparsity in the resulting minimizer. There has been considerable work on learning a single task with Banach spaces (see, for example, [4, 6, 12, 15, 17, 20, 24, 26, 34, 39, 41]). The difficulty in mapping patterns into a Banach space and making use of these features for learning lies mainly in the lack of an inner product in Banach spaces. In particular, without an appropriate counterpart of the Riesz representation of continuous linear functionals, point evaluations do not have a kernel representation in these studies. Semi-inner products, a mathematical tool introduced by Lumer [23] for the purpose of extending Hilbert space type arguments to Banach spaces, seem to be a natural substitute for inner products in Banach spaces. An illustrative example is that we were able to extend the classical theory of frames and Riesz bases to Banach spaces via semi-inner products [38]. Semi-inner products were first applied to machine learning by Der and Lee [12] for the study of large margin classification by hyperplanes in a Banach space. With this tool, we established the notion of scalar-valued reproducing kernel Banach spaces (RKBS) and investigated regularized learning schemes in RKBS [36, 37]. There has been increasing interest in the application of this new theory [40, 19, 31, 32]. We attempt to build a mathematical foundation for multi-task learning with Banach spaces. Specifically, we shall propose a definition of vector-valued RKBS and investigate its fundamental properties in the next section. Feature map representations and several concrete examples of vector-valued RKBS will be presented in Sections 3 and 4, respectively. In Section 5, we investigate regularized learning schemes in vector-valued RKBS.
2 Definition and Basic Properties
We are concerned with spaces of functions from a fixed set to a vector space. We shall allow both the space of functions and the range space to be Banach spaces. Our key tool in dealing with a general Banach space is the semi-inner product [16, 23]. Recall that a semi-inner product on a Banach space $V$ is a function from $V \times V$ to $\mathbb{C}$, denoted by $[\cdot,\cdot]_V$, such that for all $f, g, h \in V$ and $\alpha \in \mathbb{C}$:

1. (linearity with respect to the first variable) $[f+g,h]_V = [f,h]_V + [g,h]_V$ and $[\alpha f, g]_V = \alpha\,[f,g]_V$;

2. (positivity) $[f,f]_V > 0$ for $f \neq 0$;

3. (conjugate homogeneity with respect to the second variable) $[f, \alpha g]_V = \overline{\alpha}\,[f,g]_V$;

4. (Cauchy–Schwarz inequality) $|[f,g]_V|^2 \leq [f,f]_V\,[g,g]_V$.
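For a concrete instance of these axioms, one may take a finite-dimensional $\ell^p$ space with the semi-inner product of Giles (recalled later in the paper, around (3.4)). The following numerical sketch, with a helper `sip` and test vectors of our own choosing, checks the four axioms on $\mathbb{C}^3$; it is an illustration under these assumptions, not part of the paper's formal development.

```python
import numpy as np

p = 2.5  # any exponent in (1, infinity)

def sip(a, b):
    """Giles' compatible semi-inner product [a, b] on complex l^p."""
    nb = np.linalg.norm(b, p)
    return np.dot(a, np.conj(b) * np.abs(b) ** (p - 2)) / nb ** (p - 2)

rng = np.random.default_rng(0)
a, b, c = (rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3))
alpha = 1.2 - 0.7j

# 1. linearity with respect to the first variable
assert np.isclose(sip(a + b, c), sip(a, c) + sip(b, c))
assert np.isclose(sip(alpha * a, b), alpha * sip(a, b))
# 2. positivity (together with compatibility with the norm)
assert sip(a, a).real > 0 and np.isclose(sip(a, a), np.linalg.norm(a, p) ** 2)
# 3. conjugate homogeneity with respect to the second variable
assert np.isclose(sip(a, alpha * b), np.conj(alpha) * sip(a, b))
# 4. Cauchy-Schwarz inequality
assert abs(sip(a, b)) ** 2 <= sip(a, a).real * sip(b, b).real + 1e-12
print("all four semi-inner product axioms hold numerically for p =", p)
```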
A semi-inner product $[\cdot,\cdot]_V$ on $V$ is said to be compatible if
$$[f,f]_V = \|f\|_V^2, \quad f \in V,$$
where $\|\cdot\|_V$ denotes the norm on $V$. Every Banach space has a compatible semi-inner product [16, 23]. Let $[\cdot,\cdot]_V$ be a compatible semi-inner product on $V$. Then one sees by the Cauchy–Schwarz inequality that for each $f \in V$, the linear functional $f^*$ on $V$ defined by
(2.1) $f^*(g) := [g,f]_V, \quad g \in V,$
is bounded on $V$. In other words, $f^*$ lies in the dual space $V^*$ of $V$. Moreover, we have
(2.2) $\|f^*\|_{V^*} = \|f\|_V$
and
(2.3) $f^*(f) = \|f\|_V^2.$
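To make (2.1)–(2.3) concrete, the sketch below computes the dual element explicitly on a real $\ell^p$ space (whose Giles semi-inner product is recalled later in the paper) and verifies the three identities numerically; the helper names `sip` and `dual` and the sample vectors are our own illustrative assumptions.

```python
import numpy as np

p = 3.0
q = p / (p - 1)  # conjugate exponent, 1/p + 1/q = 1

def sip(a, b):
    """Compatible semi-inner product on real l^p."""
    nb = np.linalg.norm(b, p)
    return np.dot(a, b * np.abs(b) ** (p - 2)) / nb ** (p - 2)

def dual(f):
    """Dual element f* in l^q representing the functional g -> [g, f]."""
    nf = np.linalg.norm(f, p)
    return f * np.abs(f) ** (p - 2) / nf ** (p - 2)

f = np.array([1.0, -2.0, 0.5])
g = np.array([0.3, 1.0, -1.5])
fs = dual(f)

# (2.1): f*(g) = [g, f]
assert np.isclose(np.dot(g, fs), sip(g, f))
# (2.2): the norm of f* in the dual space equals the norm of f
assert np.isclose(np.linalg.norm(fs, q), np.linalg.norm(f, p))
# (2.3): f*(f) = ||f||^2
assert np.isclose(np.dot(f, fs), np.linalg.norm(f, p) ** 2)
print("identities (2.1)-(2.3) verified on l^p with p =", p)
```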
Introduce the duality mapping from $V$ to $V^*$ by setting $f \mapsto f^*$, $f \in V$.
We desire to represent the continuous linear functionals on the vector-valued RKBS to be introduced by the semi-inner product. However, the semi-inner product might not be able to fulfill this important role for an arbitrary Banach space. For instance, one verifies that a certain continuous linear functional $\mu$ on a space endowed with the usual maximum norm cannot be represented as
$$\mu(f) = [f,g]$$
for any compatible semi-inner product $[\cdot,\cdot]$ on that space and any element $g$ of it.
The above example indicates that the duality mapping might not be surjective for a general Banach space. Other problems such as non-uniqueness of compatible semi-inner products and non-injectivity of the duality mapping may also occur. To overcome these difficulties, we shall focus on Banach spaces that are uniformly convex and uniformly Fréchet differentiable in this preliminary work on vector-valued RKBS. A Banach space $V$ is uniformly convex if for all $\varepsilon > 0$ there exists a $\delta > 0$ such that
$$\|f+g\|_V \leq 2 - \delta \quad \text{whenever } \|f\|_V = \|g\|_V = 1 \text{ and } \|f-g\|_V \geq \varepsilon.$$
Uniform convexity ensures the injectivity of the duality mapping and the existence and uniqueness of the best approximation from a closed convex subset of $V$ [16]. We also say that $V$ is uniformly Fréchet differentiable if for all $f, g \in V$
(2.4) $\lim_{t \in \mathbb{R},\, t \to 0} \dfrac{\|f+tg\|_V - \|f\|_V}{t}$
exists and the limit is approached uniformly for $f, g$ in the unit ball of $V$. If $V$ is uniformly Fréchet differentiable then it has a unique compatible semi-inner product [16]. The differentiability (2.4) of the norm is useful for deriving characterization equations for the minimizer of regularized learning schemes in Banach spaces. For simplicity, we call a Banach space uniform if it is both uniformly convex and uniformly Fréchet differentiable. An analogue of the Riesz representation theorem holds for uniform Banach spaces.
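A result of Giles links the limit in (2.4) to the semi-inner product: in a uniformly Fréchet differentiable space the limit equals $\mathrm{Re}\,[g,f]_V / \|f\|_V$. The following sketch checks this numerically on a real $\ell^p$ space via a small finite difference; the step size and the vectors are arbitrary choices of ours, given purely as an assumed illustration.

```python
import numpy as np

p = 4.0

def sip(a, b):
    """Compatible semi-inner product on real l^p."""
    nb = np.linalg.norm(b, p)
    return np.dot(a, b * np.abs(b) ** (p - 2)) / nb ** (p - 2)

f = np.array([1.0, -0.7, 2.0])
g = np.array([0.5, 1.0, -1.0])

t = 1e-6  # finite-difference step approximating the limit in (2.4)
derivative = (np.linalg.norm(f + t * g, p) - np.linalg.norm(f, p)) / t
predicted = sip(g, f) / np.linalg.norm(f, p)
assert abs(derivative - predicted) < 1e-4
print("norm derivative at f in direction g:", predicted)
```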
Lemma 2.1
(Giles [16]) Let $V$ be a uniform Banach space. Then it has a unique compatible semi-inner product $[\cdot,\cdot]_V$, and the duality mapping $f \mapsto f^*$ is bijective from $V$ onto $V^*$. In other words, for each $\mu \in V^*$ there exists a unique $g \in V$ such that
$$\mu(f) = [f,g]_V, \quad f \in V.$$
In this case,
(2.5) $[f^*, g^*]_{V^*} := [g,f]_V, \quad f,g \in V,$
defines a compatible semi-inner product on $V^*$.
Let $V$ be a uniform Banach space. We shall always denote by $[\cdot,\cdot]_V$ the unique compatible semi-inner product on $V$. By Lemma 2.1 and equation (2.2), the duality mapping $f \mapsto f^*$ is bijective and isometric from $V$ to $V^*$. It is also conjugate homogeneous by property 3 of semi-inner products. However, it is non-additive unless $V$ reduces to a Hilbert space. As a consequence, a compatible semi-inner product is in general conjugate homogeneous but non-additive with respect to its second variable. Namely,
$$[f, g+h]_V \neq [f,g]_V + [f,h]_V$$
in general.
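The failure of additivity in the second variable is easy to observe numerically: on real $\ell^p$ with $p \neq 2$ (using the Giles semi-inner product recalled later in the paper), $[f, g+h] \neq [f,g] + [f,h]$ in general, while equality holds at $p = 2$, where the space is a Hilbert space. The vectors below are our own illustrative choices.

```python
import numpy as np

def sip(a, b, p):
    """Compatible semi-inner product on real l^p."""
    nb = np.linalg.norm(b, p)
    return np.dot(a, b * np.abs(b) ** (p - 2)) / nb ** (p - 2)

f = np.array([1.0, 2.0])
g = np.array([1.0, 0.0])
h = np.array([0.0, 1.0])

# p = 2: the semi-inner product is the usual inner product, hence additive
assert np.isclose(sip(f, g + h, 2.0), sip(f, g, 2.0) + sip(f, h, 2.0))
# p = 3: additivity in the second variable fails
lhs = sip(f, g + h, 3.0)
rhs = sip(f, g, 3.0) + sip(f, h, 3.0)
assert not np.isclose(lhs, rhs)
print(f"[f,g+h] = {lhs:.4f} but [f,g] + [f,h] = {rhs:.4f}")
```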
We are ready to present the definition of vector-valued RKBS. Let $\Lambda$ be a Banach space, which we shall sometimes call the output space, and let $X$ be a prescribed set, usually called the input space. A space $\mathcal{B}$ is called a Banach space of $\Lambda$-valued functions on $X$ if it consists of certain functions from $X$ to $\Lambda$ and the norm on $\mathcal{B}$ is compatible with point evaluations in the sense that
$$\|f\|_{\mathcal{B}} = 0 \text{ if and only if } f(x) = 0 \text{ for all } x \in X.$$
For instance, $L^1([0,1])$ is not a Banach space of functions while $C([0,1])$ is. We restrict our consideration to Banach spaces of functions so that point evaluations (usually referred to as "sampling" in applications) are well-defined.
Definition 2.2
We call $\mathcal{B}$ a $\Lambda$-valued RKBS on $X$ if both $\Lambda$ and $\mathcal{B}$ are uniform and $\mathcal{B}$ is a Banach space of functions from $X$ to $\Lambda$ such that for every $x \in X$, the point evaluation $\delta_x$ defined by
$$\delta_x(f) := f(x), \quad f \in \mathcal{B},$$
is continuous from $\mathcal{B}$ to $\Lambda$.
We shall derive a reproducing kernel for a vector-valued RKBS so defined. Throughout the rest of the paper, we let $[\cdot,\cdot]_{\mathcal{B}}$ and $[\cdot,\cdot]_\Lambda$ be the unique compatible semi-inner products, and $f \mapsto f^*$ the associated duality mappings, on $\mathcal{B}$ and $\Lambda$, respectively. For two Banach spaces $V_1, V_2$, we denote by $\mathcal{O}(V_1,V_2)$ the set of all the bounded operators from $V_1$ to $V_2$, and by $\mathcal{L}(V_1,V_2)$ the subset of $\mathcal{O}(V_1,V_2)$ of those bounded operators that are also linear. When $V_1 = V_2 = V$, $\mathcal{O}(V,V)$ is abbreviated as $\mathcal{O}(V)$, and likewise for $\mathcal{L}(V)$. For each $T \in \mathcal{O}(V_1,V_2)$, we denote by $\|T\|$ the greatest lower bound of all the nonnegative constants $C$ such that
$$\|Tf\|_{V_2} \leq C\,\|f\|_{V_1}, \quad f \in V_1.$$
When $T$ is also linear, this quantity equals the operator norm of $T$ in $\mathcal{L}(V_1,V_2)$. In this language, we require that the point evaluation $\delta_x$ on a $\Lambda$-valued RKBS $\mathcal{B}$ on $X$ belong to $\mathcal{L}(\mathcal{B},\Lambda)$ for all $x \in X$.
Theorem 2.3
Let $\mathcal{B}$ be a $\Lambda$-valued RKBS on $X$. Then there exists a unique function $K$ from $X \times X$ to $\mathcal{O}(\Lambda)$ such that:

(1) $K(x,\cdot)\xi \in \mathcal{B}$ for all $x \in X$ and $\xi \in \Lambda$;

(2) for all $f \in \mathcal{B}$, $x \in X$, and $\xi \in \Lambda$,
(2.6) $[f(x), \xi]_\Lambda = [f, K(x,\cdot)\xi]_{\mathcal{B}};$

(3) for all $x, y \in X$,
(2.7) $\|K(x,y)\| \leq \|\delta_x\|\,\|\delta_y\|.$
Proof: Let $x \in X$ and $\xi \in \Lambda$. As $\delta_x \in \mathcal{L}(\mathcal{B},\Lambda)$, we see that
(2.8) $|[f(x),\xi]_\Lambda| \leq \|f(x)\|_\Lambda\,\|\xi\|_\Lambda \leq \|\delta_x\|\,\|f\|_{\mathcal{B}}\,\|\xi\|_\Lambda, \quad f \in \mathcal{B}.$
The above inequality together with the linearity of the semi-inner product with respect to its first variable implies that
$$f \mapsto [f(x),\xi]_\Lambda, \quad f \in \mathcal{B},$$
is a bounded linear functional on $\mathcal{B}$. By Lemma 2.1, there exists a unique function $g_{x,\xi} \in \mathcal{B}$ such that
(2.9) $[f(x),\xi]_\Lambda = [f, g_{x,\xi}]_{\mathcal{B}}, \quad f \in \mathcal{B}.$
Define a function $K$ from $X \times X$ to the set of operators from $\Lambda$ to $\Lambda$ by setting
$$K(x,y)\xi := g_{x,\xi}(y), \quad x, y \in X,\ \xi \in \Lambda.$$
Clearly, $K$ satisfies the two requirements (1) and (2). It is also unique by the uniqueness of the function $g_{x,\xi}$ satisfying (2.9). It remains to show that it is bounded. To this end, we get by (2.8) that
$$\|K(x,\cdot)\xi\|_{\mathcal{B}}^2 = [K(x,\cdot)\xi, K(x,\cdot)\xi]_{\mathcal{B}} = [(K(x,\cdot)\xi)(x), \xi]_\Lambda \leq \|\delta_x\|\,\|K(x,\cdot)\xi\|_{\mathcal{B}}\,\|\xi\|_\Lambda.$$
It follows that
$$\|K(x,y)\xi\|_\Lambda = \|\delta_y(K(x,\cdot)\xi)\|_\Lambda \leq \|\delta_y\|\,\|K(x,\cdot)\xi\|_{\mathcal{B}} \leq \|\delta_x\|\,\|\delta_y\|\,\|\xi\|_\Lambda,$$
which proves (2.7).
We call the above function $K$ the reproducing kernel of $\mathcal{B}$. It coincides with the usual reproducing kernel when $\mathcal{B}$ is a scalar-valued RKHS, and with the vector-valued reproducing kernel when both $\mathcal{B}$ and $\Lambda$ are Hilbert spaces. We now explore basic properties of vector-valued RKBS and their reproducing kernels for further investigation and applications.
Let $\delta_x^*$ be the adjoint operator of $\delta_x$ for all $x \in X$. Denote for a Banach space $V$ by $\langle \cdot,\cdot \rangle_V$ the bilinear form on $V \times V^*$ defined by
$$\langle f, \mu \rangle_V := \mu(f), \quad f \in V,\ \mu \in V^*.$$
Thus, $\delta_x^*$ is defined by
(2.10) $\langle f, \delta_x^*(\mu) \rangle_{\mathcal{B}} = \langle \delta_x f, \mu \rangle_\Lambda, \quad f \in \mathcal{B},\ \mu \in \Lambda^*.$
Proposition 2.4
Let $\mathcal{B}$ be a $\Lambda$-valued RKBS on $X$ and $K$ its reproducing kernel. Then there holds for all $x, y \in X$, $\xi, \eta \in \Lambda$, and $\alpha \in \mathbb{C}$ that
(2.11) $0 \leq [K(x,x)\xi, \xi]_\Lambda \leq \|K(x,x)\|\,\|\xi\|_\Lambda^2,$
(2.12) $[K(y,\cdot)\eta, K(x,\cdot)\xi]_{\mathcal{B}} = [K(y,x)\eta, \xi]_\Lambda,$
(2.13) $(K(x,\cdot)\xi)^* = \delta_x^*(\xi^*),$
(2.14) $K(x,y)(\alpha\xi) = \alpha\,K(x,y)\xi,$
(2.15) $\|K(x,\cdot)\xi\|_{\mathcal{B}} \leq \|\delta_x\|\,\|\xi\|_\Lambda \quad \text{and} \quad \|K(x,\cdot)\xi\|_{\mathcal{B}} \leq \|K(x,x)\|^{1/2}\,\|\xi\|_\Lambda,$
(2.16) $\|K(x,y)\| \leq \|K(x,x)\|^{1/2}\,\|K(y,y)\|^{1/2},$
(2.17) $\overline{\operatorname{span}}\,\{(K(x,\cdot)\xi)^* : x \in X,\ \xi \in \Lambda\} = \mathcal{B}^*.$
Proof: By (2.6),
(2.18) $[K(x,x)\xi, \xi]_\Lambda = [K(x,\cdot)\xi, K(x,\cdot)\xi]_{\mathcal{B}} = \|K(x,\cdot)\xi\|_{\mathcal{B}}^2 \geq 0,$
which proves the first inequality in equation (2.11). For the second one, we use the Cauchy–Schwarz inequality of semi-inner products to get that
$$[K(x,x)\xi, \xi]_\Lambda \leq \|K(x,x)\xi\|_\Lambda\,\|\xi\|_\Lambda \leq \|K(x,x)\|\,\|\xi\|_\Lambda^2.$$
Equation (2.12) follows directly from the reproducing property (2.6) applied to $f := K(y,\cdot)\eta$. Turning to (2.13), we notice for each $f \in \mathcal{B}$ that
$$\langle f, \delta_x^*(\xi^*) \rangle_{\mathcal{B}} = \langle \delta_x f, \xi^* \rangle_\Lambda = [f(x), \xi]_\Lambda,$$
which together with (2.6) confirms (2.13). Since the duality mappings are conjugate homogeneous, we have by (2.13) that
$$(K(x,\cdot)(\alpha\xi))^* = \delta_x^*((\alpha\xi)^*) = \overline{\alpha}\,\delta_x^*(\xi^*) = \overline{\alpha}\,(K(x,\cdot)\xi)^* = (\alpha\,K(x,\cdot)\xi)^*,$$
which implies (2.14).

Recall that the duality mappings are isometric. Note also that a bounded linear operator and its adjoint have equal operator norms. Using these two facts, we obtain from equation (2.13) that
$$\|K(x,\cdot)\xi\|_{\mathcal{B}} = \|\delta_x^*(\xi^*)\|_{\mathcal{B}^*} \leq \|\delta_x\|\,\|\xi\|_\Lambda,$$
which is the first inequality in (2.15). The second one follows immediately from (2.18). Inequality (2.16) then follows from (2.12), the Cauchy–Schwarz inequality, and the second inequality in (2.15).

For the last property, let us assume that there exists some $f \in \mathcal{B}$ that vanishes on $\{(K(x,\cdot)\xi)^* : x \in X,\ \xi \in \Lambda\}$. Then
$$[f(x), \xi]_\Lambda = [f, K(x,\cdot)\xi]_{\mathcal{B}} = 0 \quad \text{for all } x \in X,\ \xi \in \Lambda,$$
which implies that $f(x) = 0$ for all $x \in X$. As $\mathcal{B}$ is a Banach space of functions, $f = 0$ as a vector in the Banach space $\mathcal{B}$. Therefore, (2.17) is true. The proof is complete.
We observe from the above proposition that the reproducing kernel of a vector-valued RKBS enjoys many properties similar to those of the reproducing kernel of a vector-valued RKHS. However, there are many significant differences due to the nature of a semi-inner product. Firstly, although $K(x,y)$ remains a homogeneous bounded operator on $\Lambda$ for all $x, y \in X$, it is generally non-additive. This can be seen from (2.13), where the duality mapping $\xi \mapsto \xi^*$ on $\Lambda$ or the inverse of the duality mapping on $\mathcal{B}$ is non-additive. Secondly, it is well known that when $\Lambda$ is a Hilbert space, a function $K : X \times X \to \mathcal{L}(\Lambda)$ is the reproducing kernel of some $\Lambda$-valued RKHS on $X$ if and only if for all finite $n \in \mathbb{N}$, pairwise distinct $x_j \in X$, and $\xi_j \in \Lambda$, $1 \leq j \leq n$,
(2.19) $\displaystyle\sum_{j=1}^{n} \sum_{k=1}^{n} [K(x_j, x_k)\xi_k, \xi_j]_\Lambda \geq 0.$
Although (2.19) still holds for the reproducing kernel of a vector-valued RKBS when $n = 2$ and the number field is $\mathbb{R}$, it may cease to be true once the number of sampling points exceeds $2$. An example will be constructed in the next section. Finally, the denseness property (2.17) in the dual space does not necessarily imply that
(2.20) $\overline{\operatorname{span}}\,\{K(x,\cdot)\xi : x \in X,\ \xi \in \Lambda\} = \mathcal{B}.$
A negative example will also be given in the next section after we present a construction of vector-valued RKBS through feature maps. Before that, we present another important property of a vector-valued RKBS.
Proposition 2.5
Let $\mathcal{B}$ be a $\Lambda$-valued RKBS on $X$. Suppose that $f_n \in \mathcal{B}$, $n \in \mathbb{N}$, converges to some $f \in \mathcal{B}$. Then $f_n(x)$ converges to $f(x)$ in the topology of $\Lambda$ for each $x \in X$. The convergence is uniform on any set where $\|K(x,x)\|$ is bounded.
Proof: Suppose that $f_n$ converges to $f$ in $\mathcal{B}$ as $n$ tends to infinity. We get by (2.6), the Cauchy–Schwarz inequality, and (2.15) that
$$\|f_n(x) - f(x)\|_\Lambda^2 = [f_n - f, K(x,\cdot)(f_n(x) - f(x))]_{\mathcal{B}} \leq \|K(x,x)\|^{1/2}\,\|f_n(x) - f(x)\|_\Lambda\,\|f_n - f\|_{\mathcal{B}}.$$
Therefore, $\|f_n(x) - f(x)\|_\Lambda \leq \|K(x,x)\|^{1/2}\,\|f_n - f\|_{\mathcal{B}}$; hence $f_n$ converges pointwise to $f$ on $X$ and the convergence is uniform on any set where $\|K(x,x)\|$ is bounded.
3 Feature Map Representations
Feature map representations form the most important way of expressing reproducing kernels. To introduce feature maps for the reproducing kernel of a vector-valued RKBS, we need the notion of the generalized adjoint [22] of a bounded linear operator between Banach spaces. Let $W_1, W_2$ be two uniform Banach spaces with the compatible semi-inner products $[\cdot,\cdot]_{W_1}$ and $[\cdot,\cdot]_{W_2}$, respectively. The generalized adjoint $T^\dagger$ of a $T \in \mathcal{L}(W_1, W_2)$ is an operator in $\mathcal{O}(W_2, W_1)$ defined by
$$[Tf, g]_{W_2} = [f, T^\dagger g]_{W_1}, \quad f \in W_1,\ g \in W_2.$$
It can be identified that $T^\dagger g$ is the unique element of $W_1$ whose dual element is $T^*(g^*)$, where $T^* \in \mathcal{L}(W_2^*, W_1^*)$ denotes the usual adjoint of $T$. Thus, $T^\dagger$ is indeed bounded, as
$$\|T^\dagger g\|_{W_1} = \|T^*(g^*)\|_{W_1^*} \leq \|T^*\|\,\|g^*\|_{W_2^*} = \|T\|\,\|g\|_{W_2}.$$
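As a numerical sketch of the generalized adjoint, take both spaces to be a real two-dimensional $\ell^p$ space and $T$ a matrix; composing the ordinary matrix transpose with the duality mappings (whose inverse on $\ell^p$ is the duality mapping of $\ell^q$) produces $T^\dagger$. The matrix and vectors below are our own illustrative assumptions, not from the paper.

```python
import numpy as np

p = 3.0
q = p / (p - 1)

def sip(a, b, r):
    """Compatible semi-inner product on real l^r."""
    nb = np.linalg.norm(b, r)
    return np.dot(a, b * np.abs(b) ** (r - 2)) / nb ** (r - 2)

def dual(f, r):
    """Duality mapping on real l^r; its inverse is the duality mapping of l^{r'}."""
    nf = np.linalg.norm(f, r)
    return f * np.abs(f) ** (r - 2) / nf ** (r - 2)

T = np.array([[1.0, 2.0], [0.0, -1.0]])  # a linear operator on l^p(R^2)

def gen_adjoint(g):
    """T† g: pass g through the duality maps around the ordinary adjoint T^T."""
    return dual(T.T @ dual(g, p), q)

f = np.array([1.0, -2.0])
g = np.array([0.5, 1.5])
h = np.array([2.0, -1.0])

# defining identity of the generalized adjoint: [Tf, g] = [f, T† g]
assert np.isclose(sip(T @ f, g, p), sip(f, gen_adjoint(g), p))
# T† is bounded but, unlike a Hilbert space adjoint, not additive
assert not np.allclose(gen_adjoint(g + h), gen_adjoint(g) + gen_adjoint(h))
print("T† g =", gen_adjoint(g))
```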
We are now in a position to present a characterization of the reproducing kernel of a vector-valued RKBS.
Theorem 3.1
A function $K : X \times X \to \mathcal{O}(\Lambda)$ is the reproducing kernel of some $\Lambda$-valued RKBS on $X$ if and only if there exist a uniform Banach space $W$ and a mapping $\Phi : X \to \mathcal{L}(W, \Lambda)$ such that
(3.1) $K(x,y) = \Phi(y)\,\Phi(x)^\dagger, \quad x, y \in X,$
and
(3.2) $\overline{\operatorname{span}}\,\{(\Phi(x)^\dagger \xi)^* : x \in X,\ \xi \in \Lambda\} = W^*.$
Here, for $w \in W$, $\Phi(\cdot)w$ is the function from $X$ to $\Lambda$ defined by $(\Phi(\cdot)w)(x) := \Phi(x)w$, $x \in X$.
Proof: Suppose that $K$ is the reproducing kernel of some $\Lambda$-valued RKBS $\mathcal{B}$ on $X$. Set $W := \mathcal{B}$ and define $\Phi : X \to \mathcal{L}(W, \Lambda)$ by
$$\Phi(x)f := f(x), \quad f \in \mathcal{B},\ x \in X.$$
To identify $\Phi(x)^\dagger$, we observe by the reproducing property (2.6) for all $f \in \mathcal{B}$ and $\xi \in \Lambda$ that
$$[\Phi(x)f, \xi]_\Lambda = [f(x), \xi]_\Lambda = [f, K(x,\cdot)\xi]_{\mathcal{B}},$$
which implies that $\Phi(x)^\dagger \xi = K(x,\cdot)\xi$ for all $x \in X$ and $\xi \in \Lambda$. Requirement (3.2) is fulfilled by (2.17). By the forms of $\Phi$ and $\Phi^\dagger$, we obtain that
$$\Phi(y)\,\Phi(x)^\dagger \xi = (K(x,\cdot)\xi)(y) = K(x,y)\xi, \quad x, y \in X,\ \xi \in \Lambda,$$
which proves (3.1).

On the other hand, suppose that $K$ is of the form (3.1) in terms of some mapping $\Phi$ satisfying the denseness condition (3.2). We shall construct the RKBS that takes $K$ as its reproducing kernel. For this purpose, we let $\mathcal{B}$ be composed of functions from $X$ to $\Lambda$ of the following form
$$f_w := \Phi(\cdot)w, \quad w \in W.$$
Since each $\Phi(x)$ is a linear operator, $\mathcal{B}$ is a linear vector space. We impose a norm on $\mathcal{B}$ by setting
$$\|f_w\|_{\mathcal{B}} := \|w\|_W, \quad w \in W.$$
To verify that this is a well-defined norm, it suffices to show that the representer $w$ of a function $f_w \in \mathcal{B}$ is unique. Assume that $f_w = f_v$. Then for all $x \in X$ and $\xi \in \Lambda$,
$$\langle w - v, (\Phi(x)^\dagger \xi)^* \rangle_W = [w, \Phi(x)^\dagger \xi]_W - [v, \Phi(x)^\dagger \xi]_W = [\Phi(x)w, \xi]_\Lambda - [\Phi(x)v, \xi]_\Lambda = 0,$$
which combined with (3.2) implies that $w = v$. The arguments also show that $\mathcal{B}$ is a Banach space of functions. Moreover, it is a uniform Banach space as it is isometrically isomorphic to $W$. Clearly, we have for each $x \in X$ and $w \in W$ that
$$\|f_w(x)\|_\Lambda = \|\Phi(x)w\|_\Lambda \leq \|\Phi(x)\|\,\|w\|_W = \|\Phi(x)\|\,\|f_w\|_{\mathcal{B}},$$
which shows that point evaluations are bounded on $\mathcal{B}$. We conclude that $\mathcal{B}$ is a $\Lambda$-valued RKBS on $X$. It remains to prove that $K$ is the reproducing kernel of $\mathcal{B}$. To this end, we identify the unique compatible semi-inner product on $\mathcal{B}$ as
$$[f_w, f_v]_{\mathcal{B}} := [w, v]_W, \quad w, v \in W,$$
and observe for all $w \in W$, $x \in X$, and $\xi \in \Lambda$ that
$$[f_w(x), \xi]_\Lambda = [\Phi(x)w, \xi]_\Lambda = [w, \Phi(x)^\dagger \xi]_W = [f_w, f_{\Phi(x)^\dagger \xi}]_{\mathcal{B}} = [f_w, K(x,\cdot)\xi]_{\mathcal{B}},$$
which is what we want. The proof is complete.
We call the Banach space $W$ and the mapping $\Phi$ in Theorem 3.1 a pair of feature space and feature map for $K$, respectively. The proof of Theorem 3.1 contains a construction of vector-valued RKBS by feature maps, which we pull out separately as the corollary below.
Corollary 3.2
As an interesting application of Corollary 3.2, we shall show that a vector-valued RKBS is always isometrically isomorphic to a scalar-valued RKBS on a different input space.
Corollary 3.3
If $\mathcal{B}$ is a $\Lambda$-valued RKBS on $X$ with reproducing kernel $K$, then the following linear vector space of complex-valued functions on $X \times \Lambda$,
$$\tilde{\mathcal{B}} := \{\tilde{f} : f \in \mathcal{B}\}, \quad \text{where } \tilde{f}(x,\xi) := [f(x), \xi]_\Lambda,\ (x,\xi) \in X \times \Lambda,$$
is an RKBS on $X \times \Lambda$ with the norm
$$\|\tilde{f}\|_{\tilde{\mathcal{B}}} := \|f\|_{\mathcal{B}}$$
and the compatible semi-inner product
$$[\tilde{f}, \tilde{g}]_{\tilde{\mathcal{B}}} := [f, g]_{\mathcal{B}}, \quad f, g \in \mathcal{B}.$$
The reproducing kernel of $\tilde{\mathcal{B}}$ is
$$\tilde{K}\big((x,\xi), (y,\eta)\big) := [K(x,y)\xi, \eta]_\Lambda, \quad (x,\xi), (y,\eta) \in X \times \Lambda.$$
Proof: It suffices to point out that $\tilde{\mathcal{B}}$ is constructed by Corollary 3.2 via the choices $W := \mathcal{B}$ and
$$\Phi(x,\xi)f := [f(x), \xi]_\Lambda, \quad f \in \mathcal{B},\ (x,\xi) \in X \times \Lambda.$$
The feature map $\Phi$ satisfies the denseness condition (3.2) by (2.17).
We shall next construct by Corollary 3.2 simple vector-valued RKBS to show that the reproducing kernel of a general vector-valued RKBS might not satisfy (2.19) or (2.20). Let $p, q \in (1, +\infty)$ satisfy
(3.3) $\dfrac{1}{p} + \dfrac{1}{q} = 1.$
Here, for the sake of convenience in enumerating elements from a finite set, we set $\mathbb{N}_n := \{1, 2, \ldots, n\}$ for $n \in \mathbb{N}$. For each $n \in \mathbb{N}$ and $p \in (1, +\infty)$, $\ell^p(\mathbb{N}_n)$ denotes the Banach space of all vectors $a = (a_j : j \in \mathbb{N}_n) \in \mathbb{C}^n$ with the norm
$$\|a\|_p := \left( \sum_{j \in \mathbb{N}_n} |a_j|^p \right)^{1/p}.$$
The space $\ell^p(\mathbb{N}_n)$ is a uniform Banach space with the compatible semi-inner product
$$[a, b]_p := \frac{\sum_{j \in \mathbb{N}_n} a_j\,\overline{b_j}\,|b_j|^{p-2}}{\|b\|_p^{p-2}}.$$
The dual element of $a \in \ell^p(\mathbb{N}_n)$ is hence given by
(3.4) $a^* = \dfrac{\left( \overline{a_j}\,|a_j|^{p-2} : j \in \mathbb{N}_n \right)}{\|a\|_p^{p-2}}.$
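A quick numerical check of (3.4), as a sketch of our own: the dual element has $\ell^q$ norm equal to $\|a\|_p$, it represents the semi-inner-product functional, and applying the analogous formula with exponent $q$ recovers $a$, reflecting the bijectivity of the duality mapping on these uniform spaces.

```python
import numpy as np

p = 2.5
q = p / (p - 1)

def dual(a, r):
    """Dual element of a in l^r(N_n), following formula (3.4)."""
    return np.conj(a) * np.abs(a) ** (r - 2) / np.linalg.norm(a, r) ** (r - 2)

a = np.array([2.0 + 1.0j, -1.0, 0.5j, 3.0])
a_star = dual(a, p)

# the duality mapping is isometric from l^p into l^q
assert np.isclose(np.linalg.norm(a_star, q), np.linalg.norm(a, p))
# a* represents the functional b -> [b, a]; in particular <a, a*> = ||a||_p^2
assert np.isclose(np.dot(a, a_star), np.linalg.norm(a, p) ** 2)
# the duality mapping of l^q inverts that of l^p
assert np.allclose(dual(a_star, q), a)
print("dual element identities for (3.4) verified")
```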
Non-completeness of the linear span of the reproducing kernel in $\mathcal{B}$. We give a counterexample to (2.20) first. We choose the output space $\Lambda$ and the feature space $W$ to be finite-dimensional $\ell^p$ spaces, whose dual spaces are the corresponding $\ell^q$ spaces, and the input space $X$ to be a finite set of discrete points. A feature map $\Phi$ should satisfy the denseness condition (3.2). We note by the definition of the generalized adjoint that this condition is equivalent to
(3.5) $\operatorname{span}\,\{\Phi(x)^*(\xi^*) : x \in X,\ \xi \in \Lambda\} = W^*,$
where $\Phi(x)^*$ denotes the usual adjoint of $\Phi(x)$ for all $x \in X$.
Let us take a close look at equation (2.20). By Corollary 3.2, a general function in $\mathcal{B}$ is of the form $f_w = \Phi(\cdot)w$ for some $w \in W$. Equation (2.20) does not hold true if and only if there exists a nontrivial $w \in W$ such that
$$[\Phi(x)^\dagger \xi, w]_W = 0 \quad \text{for all } x \in X,\ \xi \in \Lambda,$$
which in turn is equivalent to the statement that $\operatorname{span}\,\{\Phi(x)^\dagger \xi : x \in X,\ \xi \in \Lambda\}$ is not dense in $W$. We conclude that to construct a $\Lambda$-valued RKBS for which (2.20) is not true, it suffices to find a feature map $\Phi$ that satisfies (3.5) but
(3.6) $\overline{\operatorname{span}}\,\{\Phi(x)^\dagger \xi : x \in X,\ \xi \in \Lambda\} \neq W.$
To this end, we find a sequence of vectors and set
(3.7) 
where the first component of the vector is taken. Since for each $x \in X$, $\Phi(x)$ is a linear operator between finite-dimensional spaces, it is bounded. We reformulate (3.5) and (3.6) to get that they are respectively equivalent to
(3.8) 
and
(3.9) 
Here, for a vector $a$, we get by (3.4) that its dual element is obtained, up to the normalizing factor $\|a\|_p^{p-2}$, by applying the map $t \mapsto \overline{t}\,|t|^{p-2}$ to each component. Therefore, the task reduces to finding a nonsingular matrix that becomes singular when we apply this function to each of its components. We find two such matrices as shown below.
Non-positive-definiteness of the reproducing kernel of a vector-valued RKBS. We shall give an example to show that (2.19) might not hold true for the reproducing kernel of a vector-valued RKBS when the number of sampling points exceeds $2$. In fact, we let the spaces and the input set be constructed as in the above example, with the vectors in the definition (3.7) of $\Phi$ to be appropriately chosen. Our purpose is to find sampling points and output vectors such that
(3.10) 
We first note that