Dazu¶
Dazu is a powerfull engine dialogue engine with two main parts: NLU
and dialogue
. The main objetive of this project is too use existing chatbots projects and uses it to develop your solution.
The name was inspired by Dazu Ausubel because the main objective of this project was to build a collaborative platform to maintain Bots for learning.
Inspired by Watson Assistant and Rasa.
Getting Started¶
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
Docker:
docker-compose up --build
Editable mode:
pip install -e .
cd examples/my-first-bot
dazu train
dazu run
After that you should see this output:
dazu run
* Serving Flask app "dazu.__main__" (lazy loading)
* Environment: production
WARNING: Do not use the development server in a production environment.
Use a production WSGI server instead.
* Debug mode: off
* Running on http://0.0.0.0:5000/ (Press CTRL+C to quit)
Prerequisites¶
To build, first install all necessary dependencies:
All the dependencies can be find in requirements.txt
and development in requirements-dev.txt
.
Installing Development Environment¶
A step by step installation guide:
Run these commands to install
dazu
in your python virtual env:
pip install -r requirements-dev.txt
pip install -e .
Go to examples folder and start the project:
cd examples/my-first-bot
dazu train
dazu run
Have fun :rocket:
Code Style¶
To ensure a standardized code style we use the formatter black. To ensure our type annotations are correct we use the type checker pytype. If your code is not formatted properly or doesn’t type check, travis will fail to build.
Formatting¶
If you want to automatically format your code on every commit, you can use pre-commit.
Just install it via pip install pre-commit
and execute pre-commit install
in the root folder.
This will add a hook to the repository, which reformats files on every commit.
If you want to set it up manually, install black via pip install -r requirements-dev.txt
.
To reformat files execute
make formatter
Type Checking¶
If you want to check types on the codebase, install pytype
using pip install -r requirements-dev.txt
.
To check the types execute
make types
Running the tests¶
Still needed
Deployment¶
Still needed
Contributing¶
Please read CONTRIBUTING.md for details on our code of conduct, and the process for submitting pull requests to us.
Versioning¶
We use SemVer for versioning. For the versions available, see the tags on this repository.
Authors¶
Raphael Pinto - Creator - ralphg6
See also the list of contributors who participated in this project.
Acknowledgments¶
Arthur Temporim - arthurTemporim
Welcome to Dazu’s documentation!¶
dazu package¶
Subpackages¶
Submodules¶
dazu.config module¶
-
class
dazu.config.
DazuConfig
(configuration_values: Optional[Dict[str, Any]] = None)¶ Bases:
object
-
as_dict
() → Dict[str, Any]¶
-
property
component_names
¶
-
for_component
(index, defaults=None) → Dict[str, Any]¶
-
get
(key: str, default: Any = None) → Any¶
-
items
() → List[Any]¶
-
override
(config) → None¶
-
set_component_attr
(index, **kwargs) → None¶
-
view
() → str¶
-
-
exception
dazu.config.
InvalidConfigError
(message: str)¶ Bases:
ValueError
Raised if an invalid configuration is encountered.
-
dazu.config.
component_config_from_pipeline
(index: int, pipeline: List[Dict[str, Any]], defaults: Optional[Dict[str, Any]] = None) → Dict[str, Any]¶
-
dazu.config.
load
(config: Union[str, Dict, None] = None, **kwargs: Any) → dazu.config.DazuConfig¶
-
dazu.config.
override_defaults
(defaults: Optional[Dict[str, Any]], custom: Optional[Dict[str, Any]]) → Dict[str, Any]¶
dazu.constants module¶
dazu.registry module¶
-
class
dazu.registry.
Registry
¶ Bases:
object
-
classmethod
get
(moduleName: str = None, moduleCls: Type[dazu.typing.module.Module] = None, prefix: str = '', default=None) → Any¶
-
classmethod
getAdapter
(config: dazu.config.DazuConfig, adapterName: str = None)¶
-
modules
= {'dialogue_simple': <class 'dazu.components.dialogue.simple.SimpleDialogue'>, 'nlu_simple': <class 'dazu.components.nlu.simple.SimpleNLU'>}¶
-
classmethod
registry
(module: Type[dazu.typing.module.Module], prefix: str = '')¶
-
classmethod
registryAdapter
(adapter: Type[dazu.typing.module.Module])¶
-
classmethod
-
dazu.registry.
pipeline_template
(s: str) → Optional[List[Dict[str, Any]]]¶
dazu.version module¶
Module contents¶
dazu.adapters package¶
Subpackages¶
Submodules¶
dazu.adapters.adapter module¶
-
class
dazu.adapters.adapter.
Adapter
¶ Bases:
dazu.typing.module.Module
Adapter Class documentation
-
abstract classmethod
input
(payload: Dict) → dazu.typing.message.Message¶
-
abstract classmethod
output
(message: dazu.typing.message.Message) → Dict¶
-
classmethod
validate_data
(payload: Dict) → bool¶ validate_data method documentation
-
abstract classmethod
-
class
dazu.adapters.adapter.
MessageAdapter
¶ Bases:
dazu.adapters.adapter.Adapter
-
classmethod
input
(payload: Dict) → dazu.typing.message.Message¶
-
classmethod
name
()¶ The name property is a function of the class - its __name__.
-
classmethod
output
(message: dazu.typing.message.Message) → Dict¶
-
classmethod
validate_data
(payload: Dict) → bool¶ validate_data method documentation
-
classmethod
Module contents¶
dazu.adapters.adapters package¶
Submodules¶
dazu.adapters.adapters.google module¶
-
class
dazu.adapters.adapters.google.
GoogleAdapter
¶ Bases:
dazu.adapters.adapter.Adapter
-
classmethod
input
(payload: Dict) → dazu.typing.message.Message¶
-
classmethod
name
()¶ The name property is a function of the class - its __name__.
-
classmethod
output
(message: dazu.typing.message.Message) → Dict¶
-
classmethod
validate_data
(payload: Dict) → bool¶ validate_data method documentation
-
classmethod
Module contents¶
dazu.components package¶
Subpackages¶
Submodules¶
dazu.components.component module¶
-
class
dazu.components.component.
Component
(component_config: Optional[Dict[str, Any]] = None)¶ Bases:
dazu.typing.module.Module
A component is a message processing unit in a pipeline. Components are collected sequentially in a pipeline. Each component is called one after another. This holds for initialization, training, persisting and loading the components. If a component comes first in a pipeline, its methods will be called first. E.g. to process an incoming message, the
process
method of each component will be called. During the processing (as well as the training, persisting and initialization) components can pass information to other components. The information is passed to other components by providing attributes to the so called pipeline context. The pipeline context contains all the information of the previous components a component can use to do its own processing. For example, a featurizer component can provide features that are used by another component down the pipeline to do intent classification.-
classmethod
cache_key
(component_meta: Dict[str, Any], model_metadata: dazu.typing.model.Metadata) → Optional[str]¶ This key is used to cache components. If a component is unique to a model it should return None. Otherwise, an instantiation of the component will be reused for all models where the metadata creates the same key.
-
classmethod
can_handle_language
(language: Hashable) → bool¶ Check if component supports a specific language. This method can be overwritten when needed. (e.g. dynamically determine which language is supported.)
-
classmethod
create
(component_config: Dict[str, Any], config: dazu.config.DazuConfig) → dazu.components.component.Component¶ Creates this component (e.g. before a training is started). Method can access all configuration parameters.
-
defaults
= {}¶
-
language_list
= None¶
-
classmethod
load
(component_config: Dict[str, Any], model_dir: Optional[str] = None, model_metadata: Optional[Metadata] = None, cached_component: Optional[Component] = None, **kwargs: Any) → dazu.components.component.Component¶ Load this component from file. After a component has been trained, it will be persisted by calling persist. When the pipeline gets loaded again, this component needs to be able to restore itself. Components can rely on any context attributes that are created by
components.Component.create()
calls to components previous to this one.
-
partially_process
(message: dazu.typing.message.Message) → dazu.typing.message.Message¶ Allows the component to process messages during training (e.g. external training data). The passed message will be processed by all components previous to this one in the pipeline.
-
persist
(file_name: str, model_dir: str) → Optional[Dict[str, Any]]¶ Persist this component to disk for future loading.
-
prepare_partial_processing
(pipeline: List[Component], context: Dict[str, Any]) → None¶ Sets the pipeline and context used for partial processing. The pipeline should be a list of components that are previous to this one in the pipeline and have already finished their training (and can therefore be safely used to process messages).
-
process
(message: dazu.typing.message.Message, **kwargs: Any) → None¶ Process an incoming message. This is the components chance to process an incoming message. The component can rely on any context attribute to be present, that gets created by a call to
rasa.nlu.components.Component.create()
of ANY component and on any context attributes created by a call torasa.nlu.components.Component.process()
of components previous to this one.
-
provide_context
() → Optional[Dict[str, Any]]¶ Initialize this component for a new pipeline This function will be called before the training is started and before the first message is processed using the interpreter. The component gets the opportunity to add information to the context that is passed through the pipeline during training and message parsing. Most components do not need to implement this method. It’s mostly used to initialize framework environments like MITIE and spacy (e.g. loading word vectors for the pipeline).
-
provides
= []¶
-
classmethod
required_packages
() → List[str]¶ Specify which python packages need to be installed to use this component, e.g.
["spacy"]
. More specifically, these should be importable python package names e.g. sklearn and not package names in the dependencies sense e.g. scikit-learn This list of requirements allows us to fail early during training if a required package is not installed.
-
requires
= []¶
-
train
(training_data: dazu.typing.training_data.TrainingData, cfg: dazu.config.DazuConfig, **kwargs: Any) → None¶ Train this component. This is the components chance to train itself provided with the training data. The component can rely on any context attribute to be present, that gets created by a call to
rasa.nlu.components.Component.create()
of ANY component and on any context attributes created by a call torasa.nlu.components.Component.train()
of components previous to this one.
-
classmethod
-
exception
dazu.components.component.
UnsupportedLanguageError
(component: str, language: str)¶ Bases:
Exception
Raised when a component is created but the language is not supported.
- Parameters
component – component name
language – language that component doesn’t support
dazu.components.engine module¶
-
class
dazu.components.engine.
Engine
(config: dazu.config.DazuConfig)¶ Bases:
object
-
components
: List[dazu.components.component.Component] = []¶
-
pipeline
: List[Type[dazu.components.component.Component]] = []¶
-
respond
(message, context={})¶
-
train
()¶
-
-
dazu.components.engine.
build_model_filename
(idx, componentCls: Type[dazu.components.component.Component])¶
Module contents¶
dazu.components.dialogue package¶
Submodules¶
dazu.components.dialogue.simple module¶
-
class
dazu.components.dialogue.simple.
SimpleDialogue
(component_config: Optional[Dict[str, Any]] = None, dialog_nodes: Dict = None)¶ Bases:
dazu.components.component.Component
-
classmethod
load
(meta: Dict[str, Any], model_dir: Optional[str] = None, model_metadata: Optional[Metadata] = None, cached_component: Optional[Component] = None, **kwargs: Any) → dazu.components.component.Component¶ Load this component from file. After a component has been trained, it will be persisted by calling persist. When the pipeline gets loaded again, this component needs to be able to restore itself. Components can rely on any context attributes that are created by
components.Component.create()
calls to components previous to this one.
-
classmethod
name
()¶ The name property is a function of the class - its __name__.
-
persist
(file_name: str, model_dir: str) → Optional[Dict[str, Any]]¶ Persist this component to disk for future loading.
-
process
(message: dazu.typing.message.Message, **kwargs: Any) → None¶ Process an incoming message. This is the components chance to process an incoming message. The component can rely on any context attribute to be present, that gets created by a call to
rasa.nlu.components.Component.create()
of ANY component and on any context attributes created by a call torasa.nlu.components.Component.process()
of components previous to this one.
-
train
(training_data: dazu.typing.training_data.TrainingData, cfg: dazu.config.DazuConfig, **kwargs: Any) → None¶ Train this component. This is the components chance to train itself provided with the training data. The component can rely on any context attribute to be present, that gets created by a call to
rasa.nlu.components.Component.create()
of ANY component and on any context attributes created by a call torasa.nlu.components.Component.train()
of components previous to this one.
-
classmethod
-
dazu.components.dialogue.simple.
evalCondition
(condition, context, intent, entities)¶
-
dazu.components.dialogue.simple.
get_dialog_anythinelse
(dialog_nodes)¶
-
dazu.components.dialogue.simple.
get_dialog_welcome
(dialog_nodes)¶
Module contents¶
dazu.components.nlu package¶
Submodules¶
dazu.components.nlu.simple module¶
-
class
dazu.components.nlu.simple.
SimpleNLU
(component_config: Optional[Dict[str, Any]] = None, intent_model: Dict = None)¶ Bases:
dazu.components.component.Component
-
classmethod
load
(meta: Dict[str, Any], model_dir: Optional[str] = None, model_metadata: Optional[Metadata] = None, cached_component: Optional[Component] = None, **kwargs: Any) → dazu.components.component.Component¶ Load this component from file. After a component has been trained, it will be persisted by calling persist. When the pipeline gets loaded again, this component needs to be able to restore itself. Components can rely on any context attributes that are created by
components.Component.create()
calls to components previous to this one.
-
classmethod
name
()¶ The name property is a function of the class - its __name__.
-
persist
(file_name: str, model_dir: str) → Optional[Dict[str, Any]]¶ Persist this component to disk for future loading.
-
process
(message: dazu.typing.message.Message, **kwargs: Any) → None¶ Process an incoming message. This is the components chance to process an incoming message. The component can rely on any context attribute to be present, that gets created by a call to
rasa.nlu.components.Component.create()
of ANY component and on any context attributes created by a call torasa.nlu.components.Component.process()
of components previous to this one.
-
train
(training_data: dazu.typing.training_data.TrainingData, cfg: dazu.config.DazuConfig, **kwargs: Any) → None¶ Train this component. This is the components chance to train itself provided with the training data. The component can rely on any context attribute to be present, that gets created by a call to
rasa.nlu.components.Component.create()
of ANY component and on any context attributes created by a call torasa.nlu.components.Component.train()
of components previous to this one.
-
classmethod
-
dazu.components.nlu.simple.
simmilarity
(a, b)¶
-
dazu.components.nlu.simple.
tokenize
(stopwords: List[str], sentence: str)¶
Module contents¶
dazu.training_data package¶
Subpackages¶
Submodules¶
dazu.training_data.reader module¶
-
class
dazu.training_data.reader.
Reader
¶ Bases:
dazu.typing.module.Module
-
abstract classmethod
load
(config: dazu.config.DazuConfig) → dazu.typing.training_data.TrainingData¶
-
classmethod
validate_data
(config: dazu.config.DazuConfig, data: Dict) → bool¶
-
abstract classmethod
Module contents¶
dazu.training_data.formats package¶
Submodules¶
dazu.training_data.formats.json module¶
-
class
dazu.training_data.formats.json.
JsonReader
¶ Bases:
dazu.training_data.reader.Reader
-
classmethod
load
(config: dazu.config.DazuConfig) → dazu.typing.training_data.TrainingData¶
-
classmethod
Module contents¶
dazu.typing package¶
Submodules¶
dazu.typing.message module¶
dazu.typing.model module¶
-
class
dazu.typing.model.
Metadata
(data)¶ Bases:
object
-
class
dazu.typing.model.
Model
(data)¶ Bases:
object
dazu.typing.module module¶
Module contents¶
dazu.utils package¶
Submodules¶
dazu.utils.io module¶
-
dazu.utils.io.
create_directory
(directory_path: str) → None¶ Creates a directory and its super paths. Succeeds even if the path already exists.
-
dazu.utils.io.
create_directory_for_file
(file_path: str) → None¶ Creates any missing parent directories of this file path.
-
dazu.utils.io.
create_path
(file_path: str) → None¶ Makes sure all directories in the ‘file_path’ exists.
-
dazu.utils.io.
create_temporary_file
(data: Any, suffix: str = '', mode: str = 'w+') → str¶ Creates a tempfile.NamedTemporaryFile object for data. mode defines NamedTemporaryFile’s mode parameter in py3.
-
dazu.utils.io.
dump_obj_as_json_to_file
(filename: str, obj: Any) → None¶ Dump an object as a json string to a file.
-
dazu.utils.io.
enable_async_loop_debugging
(event_loop: asyncio.events.AbstractEventLoop, slow_callback_duration: float = 0.1) → asyncio.events.AbstractEventLoop¶
-
dazu.utils.io.
fix_yaml_loader
() → None¶ Ensure that any string read by yaml is represented as unicode.
-
dazu.utils.io.
is_subdirectory
(path: str, potential_parent_directory: str) → bool¶
-
dazu.utils.io.
list_directory
(path: str) → List[str]¶ Returns all files and folders excluding hidden files. If the path points to a file, returns the file. This is a recursive implementation returning files in any depth of the path.
-
dazu.utils.io.
list_files
(path: str) → List[str]¶ Returns all files excluding hidden files. If the path points to a file, returns the file.
-
dazu.utils.io.
list_subdirectories
(path: str) → List[str]¶ Returns all folders excluding hidden files. If the path points to a file, returns an empty list.
-
dazu.utils.io.
read_config_file
(filename: str) → Dict[str, Any]¶ Parses a yaml configuration file. Content needs to be a dictionary :param filename: The path to the file which should be read.
-
dazu.utils.io.
read_file
(filename: str, encoding: str = 'utf-8') → Any¶ Read text from a file.
-
dazu.utils.io.
read_json_file
(filename: str) → Any¶ Read json from a file.
-
dazu.utils.io.
read_yaml
(content: str) → Union[List[Any], Dict[str, Any]]¶ Parses yaml from a text. :param content: A text containing yaml content.
-
dazu.utils.io.
read_yaml_file
(filename: str) → Union[List[Any], Dict[str, Any]]¶ Parses a yaml file. :param filename: The path to the file which should be read.
-
dazu.utils.io.
replace_environment_variables
() → None¶ Enable yaml loader to process the environment variables in the yaml.
-
dazu.utils.io.
unarchive
(byte_array: bytes, directory: str) → str¶ Tries to unpack a byte array interpreting it as an archive. Tries to use tar first to unpack, if that fails, zip will be used.
-
dazu.utils.io.
write_text_file
(content: str, file_path: Union[str, pathlib.Path], encoding: str = 'utf-8', append: bool = False) → None¶ Writes text to a file. :param content: The content to write. :param file_path: The path to which the content should be written. :param encoding: The encoding which should be used. :param append: Whether to append to the file or to truncate the file.
-
dazu.utils.io.
write_yaml_file
(data: Dict, filename: Union[str, pathlib.Path]) → None¶ Writes a yaml file. :param data: The data to write. :param filename: The path to the file which should be written.
-
dazu.utils.io.
zip_folder
(folder: str) → str¶ Create an archive from a folder.